1. Introduction

The automation of first order logic has received comparatively 
little attention from researchers intent upon synthesizing the theorem 
proving mechanism used by humans. The dominant point of view [tF], [18] 
has been that theorem proving on the computer should be oriented to the 
capabilities of the computer rather than to the human mind and therefore 
one should not be afraid to provide the computer with a logic that humans 
might find strange and uncomfortable. The preeminence of this point of 
view is not hard to explain since until now the most successful theorem 
proving programs have been machine oriented. 

Nevertheless, there are at least two reasons for being dissatisfied
with the machine oriented approach. First, a mathematician often is
interested more in understanding the proof of a proposition than in being
told that the proposition is true, for the insight gained from an
understanding of the proof can lead to the proof of additional propositions and
the development of new mathematical concepts. However, machine oriented
proofs can appear very unnatural to a human mathematician, thereby
providing him with little if any insight. Second, the machine oriented approach
has failed to produce a computer program which even comes close to equaling
a good human mathematician in theorem proving ability; this leads one to
suspect that perhaps the logic being supplied to the machine is not as
efficient as the logic used by humans.

The approach taken in this paper has been to develop a theorem 
proving program as a vehicle for gaining a better understanding of how
humans actually prove theorems. The computer program which has emerged from
this study is based upon a logic which appears more "natural" to a human 
(i.e., more human oriented). While the program is not yet the equal of a 
top flight human mathematician, it already has given indication (evidence 
of which is presented in section 9) that it can outperform the best machine 
oriented theorem provers. 

Some work was begun in [2], [3], [7], [10] and [17] directed
toward the introduction of a reasoning by cases mechanism into automatic 
theorem proving. One can give many examples where humans use such a mech- 
anism. Thus, in proving set A is identical to set B, usually one will 
attempt to prove two cases: (1) A is a subset of B and (2) B is a subset 
of A. In proving that a system is a group, usually one will attempt to 
prove two cases: (1) if two elements are in the group, then the product
is in the group and (2) if an element is in the group, then its inverse 
is in the group. In proving a theorem by induction, one will prove a 
basis case (i.e., that it is true for, say, n=1) and also an induction case
(i.e., if it is true for n, then it must be true for n+1).

However, some serious obstacles have prevented the effective
use of reasoning by cases by previous automatic theorem provers. These
obstacles relate to the overall (i.e., global) organization of these
programs. Any problem solving program must have an executive routine which
controls the allocation of its computational resources. The executive
routine must have a procedure for (1) determining when and how to create
a goal and (2) preventing an explosion of goals from taking place.
Previous theorem proving executives [3], [10], [17] made the mistake of
associating a goal with every new formula that got generated. By contrast,
the present program first makes an intensive attempt to solve a goal (which 
typically will generate a number of new formulas for further processing 
under the control of a local executive) before deciding to generate a new 
goal. As viewed by the global executive, new goals get created only when 
a goal gets split into subgoals (where the logical basis for the split is 
a reasoning by cases argument). The rationale behind this is that the 
resources for solving a goal without splitting are considerable (i.e.,
repeated use of modus ponens, the equality inference rule, rules for
simplifying formulas, etc.) and therefore one should provide substantial
opportunity for solving the goal before introducing additional goals. 



This mistake was not made in [2] and [7], where procedures were
presented for breaking a theorem into component cases which then are 
attacked by standard theorem proving techniques. However, these procedures 
do not provide the means by which the application of the standard techniques 
can interact with the mechanism that splits the theorem into cases. Thus, 
if the standard techniques are inadequate for solving a case, the informa- 
tion generated during the unsuccessful attempt is lost and will not contri- 
bute to the process of creating new subcases. 



Nevertheless, the splitting of goals into subgoals is often necessary,
and if not controlled it could overwhelm the executive with more goals
than it could handle. The key to controlling this growth (and it has
demonstrated its value in the present program) stems from the recognition
of the fact that it is not always necessary to solve each subgoal created
by a split in order to solve an ancestor goal. For example, suppose the
program knows that it can prove a goal G by proving A and B. So it first
attempts to prove A and generates a number of new formulas in this attempt.
However, suppose that, although it still cannot prove A, it nevertheless
knows that it also could prove G by proving C and D. The program then would
attempt to prove C and D and would view these attempts as subgoals of A.
The reason C and D are regarded as subgoals of A is that their proofs could
depend upon information derived from A. In this case, the solution of C 
and D would mean that we still would have to prove B in order to conclude 
G. However, if the proofs of C and D did not utilize information derived 
from A, then we would not have to prove B and could bring the proof of the 
ancestor goal G to an immediate conclusion. Similarly, if the proof of C 
did not depend upon information derived from C, we could skip over the 
proof of D. 
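
The bookkeeping behind this pruning can be sketched as follows. This is only an illustration under stated assumptions, not the program's actual data structures: the `Fact` record, the `derive` helper and the function `siblings_still_needed` are hypothetical names, and each fact is simply tagged with the set of goals whose derived information it relies upon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A formula tagged with the goals whose information it depends upon (assumed encoding)."""
    name: str
    deps: frozenset = frozenset()


def derive(name, *premises):
    # A derived fact inherits the dependencies of every premise used to obtain it.
    return Fact(name, frozenset().union(*(p.deps for p in premises)))


def siblings_still_needed(parent, siblings, subgoal_proofs):
    """Siblings of `parent` that must still be proved once all nested subgoals are solved.

    If no nested proof used information derived from `parent`, the ancestor
    goal is already established and the remaining siblings can be skipped.
    """
    used_parent = any(parent in proof.deps for proof in subgoal_proofs)
    return list(siblings) if used_parent else []


# G is split into A and B; C and D are attempted as subgoals of A.
axiom = Fact("axiom")
lemma_from_a = Fact("formula generated while attacking A", frozenset({"A"}))

independent = [derive("C", axiom), derive("D", axiom)]
dependent = [derive("C", axiom), derive("D", axiom, lemma_from_a)]

print(siblings_still_needed("A", ["B"], independent))  # B can be skipped
print(siblings_still_needed("A", ["B"], dependent))    # B must still be proved
```

The same test applied one level down (did the proof of C use the hypothesis of case C?) decides whether D can be skipped.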

From the viewpoint of human problem solving, we can regard these 
splits as "strategies" by which some ancestor goal is to be solved (it need 
not be the immediate ancestor). The program (and also a human) may have a
number of different strategies under consideration simultaneously. In the
course of implementing these strategies, the program discovers which
strategies are of assistance to other strategies, which strategies must be completed,
and which strategies can be discarded.



2. Preliminary Concepts 

The reader is referred to [11] for a rigorous treatment of the
basic definitions and ground rules used by most programs which prove
theorems in the first order predicate calculus. An informal description 
is presented in this section in order to make sure that the reader is 
acquainted with the basic ideas. 

We use the four logical connectives ~ ("not"), ⊃ ("implies"),
∧ ("and"), ∨ ("or") as well as the universal quantifier ∀ ("for all")
and the existential quantifier ∃ ("there exists"). We assume an infinite
supply of variables, constants, functions and predicates.

Terms: A variable is a term, a constant is a term, and a function
f(t_1,...,t_n) is a term provided that its arguments t_1,...,t_n are all terms.

Atomic formulas: A predicate P(t_1,...,t_n) is an atomic formula provided
that its arguments t_1,...,t_n are all terms.

Literals: An atomic formula A is a literal and the negation of an atomic
formula ~A is a literal.

Formulas: An atomic formula is a formula. If A and B are formulas, then
~A, A⊃B, A∧B, and A∨B are formulas. If A(x) is a formula that may
depend upon some variable x, then ∀xA(x) and ∃xA(x) are formulas.

Given the desire to prove that the formula A_{n+1} logically follows
from the formulas A_1, A_2, ..., A_n, we will assume that the formulas
A_1, A_2, ..., A_n, ~A_{n+1} are all true and then attempt to derive a
contradiction. The first step is to convert each of the formulas
A_1, A_2, ..., A_n, ~A_{n+1} to prenex form
using a standard procedure [9]. A formula which is in prenex form has all
its quantifiers (if any) at the front of the formula. The next step is to
remove first all the existential quantifiers and then all the universal
quantifiers. An existential quantifier is removed by replacing the
existential variable it quantifies by a function of those universal variables whose
quantifiers appear to the left of the existential quantifier in question
[11]. For example, in the formula ∀u∀v∃w∀x P(u,x,w,v), the existential
quantifier which quantifies the variable w would be removed and w would be
replaced by a function of u and v, say f(u,v), since these variables are
universally quantified to the left of w. The resulting formula therefore
would be ∀u∀v∀x P(u,x,f(u,v),v). Universal quantifiers then are dropped with
the understanding that the variables in question are to be given universal
interpretations (i.e. if A(x) is assumed to be true, then it is assumed to
be true for all possible values of the variable x).
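
The removal of existential quantifiers described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical encoding in which a variable or constant is a string and a compound expression is a tuple `(symbol, arg_1, ..., arg_n)`; the generated function names `f1`, `f2`, ... are likewise assumptions made for the example.

```python
from itertools import count


def substitute(term, binding):
    # Replace variables in a nested-tuple term according to `binding`.
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(arg, binding) for arg in term[1:])
    return binding.get(term, term)


def skolemize(prefix, matrix):
    """Drop the quantifier prefix of a prenex formula.

    prefix: list of (quantifier, variable) pairs, outermost quantifier first,
            with quantifier equal to "forall" or "exists".
    matrix: the quantifier-free part of the formula.
    Each existential variable is replaced by a new function of the universal
    variables whose quantifiers appear to its left.
    """
    fresh_names = (f"f{i}" for i in count(1))  # hypothetical function names
    universals, binding = [], {}
    for quantifier, var in prefix:
        if quantifier == "forall":
            universals.append(var)
        else:
            name = next(fresh_names)
            # With no universal variable to the left, a new constant suffices.
            binding[var] = (name, *universals) if universals else name
    return substitute(matrix, binding)


# The example from the text: forall u, forall v, exists w, forall x: P(u,x,w,v)
prefix = [("forall", "u"), ("forall", "v"), ("exists", "w"), ("forall", "x")]
matrix = ("P", "u", "x", "w", "v")
print(skolemize(prefix, matrix))
```

For the formula in the text the result is P(u,x,f1(u,v),v), with f1 playing the role of f.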

At the heart of all current theorem proving programs is the use 
of "matching" routines. For example, suppose the program established that 
the formulas A(x,a,x) and ~A(b,y,b) both follow from the assumptions of
the problem where x and y are variables and a and b are constants. A
matching routine then would determine for the program that variable x should be
set equal to b and variable y should be set equal to a in order to
eliminate the sources of difference between the formula A(x,a,x) and the formula
A(b,y,b). Since A(x,a,x) must be true for x=b (as it is true for all values
of x) and ~A(b,y,b) must be true for y=a (as it is true for all values of y),
this means that the formula A(b,a,b) would contradict the formula ~A(b,a,b)
and the program would have obtained a proof.
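
A matching routine of this kind can be sketched as below. The paper does not give its routine, so this is a standard unification procedure offered only as an illustration; the nested-tuple encoding and the convention that the names in `VARIABLES` denote variables are assumptions of the example.

```python
# Assumed convention: these names are variables, every other string is a constant.
VARIABLES = {"x", "y", "z", "u", "v", "w"}


def is_var(term):
    return isinstance(term, str) and term in VARIABLES


def walk(term, subst):
    # Follow bindings until reaching an unbound variable or a non-variable.
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    # True if `var` appears inside `term` (prevents bindings such as x = f(x)).
    term = walk(term, subst)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg, subst) for arg in term[1:])
    return False


def match(s, t, subst=None):
    """Return variable bindings that make s and t identical, or None if there are none."""
    subst = {} if subst is None else subst
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if is_var(s):
        return None if occurs(s, t, subst) else {**subst, s: t}
    if is_var(t):
        return None if occurs(t, s, subst) else {**subst, t: s}
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            subst = match(a, b, subst)
            if subst is None:
                return None
        return subst
    return None


# The example from the text: A(x,a,x) against A(b,y,b).
print(match(("A", "x", "a", "x"), ("A", "b", "y", "b")))
```

The call binds x to b and y to a, which are the settings described in the text. If the third argument of the second formula were a different constant c, the match would fail because x cannot equal both b and c.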



3. The Logical Deduction Rules 

The use of matching routines is implicit in the rules of
inference to be described in this section. Thus, when expressions E and E'
appear in the statement of an inference rule, it should be understood that
E and E' represent expressions which have been made identical by means of
a matching routine. For example, the application of A(b,y,b), A(x,a,x)⊃B(x)
to rule R6 would mean that A(b,y,b) would be made identical to A(x,a,x) by
setting x=b, y=a and this would result in the output B(b) where A(x,a,x),
A(b,y,b), B(x) and B(b) play the role of A, A', B, and B' respectively.

As described in section 2, we begin with a set of formulas all
of which are assumed to be true. New formulas are established with the
help of rules R2 through R12. Rule R1 determines when a problem has been
solved.

R1. A problem is solved when it has been established that both the
literals A' and ~A are true.

R2. Replace formula ~~A by A.

R3. Replace formula A∧B by A, B.

R4. Replace formula ~(A∨B) by ~A, ~B (i.e. if one wishes to
prove A∨B, then either prove A or prove B).

R5. Replace formula ~(A⊃B) by A, ~B (i.e. if one wishes to prove
A⊃B, then assume A and prove B).

R6. (Modus ponens) If it has been established that A' and A⊃B are
true where A is a literal, then add B' to the set of true formulas.

R7. (Modus ponens) If it has been established that ~B' and A⊃B are
true where B is an atomic formula, then add ~A' to the set of true
formulas.

R8. (Modus ponens) If it has been established that B' and A⊃~B are
true where B is an atomic formula, then add ~A' to the set of
true formulas.

R9. (Reasoning by cases) Split A∨B into case A and case B as shown
in section 4.

R10. (Reasoning by cases) Split ~(A∧B) into case ~A and case ~B
as shown in section 4.

R11. (Equality relation) If it has been established that r=t and A(r')
are true where A(r') is a literal that depends upon the term r',
then add A(t') to the set of true formulas. See section 7 for a
more complete discussion of this rule. Also, see section 8 for
the treatment of functions that are either both associative and
commutative or just associative as the program has special routines
which incorporate such functions into the equality relation.

R12. If it has been established that P(t_1,...,t_n) and ~P(t_1',...,t_n')
are true literals where P is not the equality predicate and if
t_i has been made identical to t_i' by means of a match for all i≠j
but this match fails for i=j, then add ~(t_j=t_j') to the set of
true formulas.
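
The one-input rules R2 through R5 amount to a small rewriting step, which the following sketch illustrates. It assumes a hypothetical encoding that is not taken from the paper: atomic formulas are strings, and compound formulas are tuples `("not", A)`, `("and", A, B)`, `("or", A, B)` and `("implies", A, B)`. Matching is not needed for these rules and is left out.

```python
def apply_one_input_rule(formula):
    """Return the formulas that replace `formula` under R2-R5, or None if none applies."""
    if not isinstance(formula, tuple):
        return None
    if formula[0] == "and":                      # R3: A and B  ->  A, B
        return [formula[1], formula[2]]
    if formula[0] == "not" and isinstance(formula[1], tuple):
        inner = formula[1]
        if inner[0] == "not":                    # R2: ~~A  ->  A
            return [inner[1]]
        if inner[0] == "or":                     # R4: ~(A or B)  ->  ~A, ~B
            return [("not", inner[1]), ("not", inner[2])]
        if inner[0] == "implies":                # R5: ~(A implies B)  ->  A, ~B
            return [inner[1], ("not", inner[2])]
    return None


def simplify(formulas):
    """Apply R2-R5 repeatedly until no formula can be replaced any further."""
    todo, done = list(formulas), []
    while todo:
        formula = todo.pop()
        replacement = apply_one_input_rule(formula)
        if replacement is None:
            done.append(formula)
        else:
            todo.extend(replacement)
    return done


# ~(A implies (B or C)) breaks down into A, ~B and ~C.
print(simplify([("not", ("implies", "A", ("or", "B", "C")))]))
```

Disjunctions and negated conjunctions are deliberately left alone by this step, since they are reserved for the reasoning by cases rules R9 and R10.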

Of the machine oriented approaches to automatic theorem proving,
by far the most important have been those based upon the resolution
principle [15]. Although resolution is based upon a single inference rule
for generating new formulas, it is convenient to regard it as two rules,
called unit and non-unit resolution respectively. Unit resolution
says that if it has been established that A_1' and ~A_1 ∨ A_2 ∨ ... ∨ A_r are
true where A_1, A_2, ..., A_r are literals, then we can add A_2' ∨ ... ∨ A_r' to
the set of true formulas. From the standpoint of rule R9, the formula
~A_1 ∨ A_2 ∨ ... ∨ A_r can be thought of as representing r cases given by the
r literals ~A_1, A_2, ..., A_r. The application of unit resolution to the
formula ~A_1 ∨ A_2 ∨ ... ∨ A_r then has the worthwhile property that it produces
as output a formula with one less case (i.e. A_2' ∨ ... ∨ A_r' has only
r-1 cases). On the other hand, non-unit resolution takes as input the
two formulas ~A_1 ∨ A_2 ∨ ... ∨ A_r and A_1' ∨ B_2 ∨ ... ∨ B_t and generates as
output A_2' ∨ ... ∨ A_r' ∨ B_2' ∨ ... ∨ B_t' which is a formula which rule R9 would
regard as representing r+t-2 cases. Since r≥2 and t≥2, it follows that
r+t-2 ≥ max(r,t). This means that the output formula of a non-unit
resolution will have at least as many and often more cases than either of its
input formulas. Since the ultimate objective is to eliminate all cases
from some formula, it is understandable that non-unit resolution would be
much less effective than unit resolution and indeed researchers soon gave
preference to unit resolution when generating new formulas [19], [14]. Now,
since ~A∨B is logically equivalent to A⊃B, unit resolution at least is
related to modus ponens (see rules R6 through R8) which is a common form
of human reasoning whereas non-unit resolution appears very unnatural to
a human. The suspicion is quite strong that humans do not use non-unit



2 
As in the other deduction rules described in this section, we
are adopting the convention which considers expressions written as E and
E' as having been made identical by means of a matching routine.



resolution when making deductions but do use a form of unit resolution 
precisely because non-unit resolution is so much less efficient than unit 
resolution. 

There is no mechanism in resolution for breaking a difficult
problem into two or more simpler subproblems. Yet this is a common
feature of human problem solving. The value of such a mechanism is that it
is generally easier to solve a number of simple problems than it is to
solve a single hard problem. The present program uses reasoning by cases
as the logical basis for generating a goal-subgoal hierarchy and this is
described in section 6. However, the following simple example can help
to illustrate the main ideas.

Example: We wish to obtain a contradiction from the following six axioms
(A1 through A6) where x, y, and z represent variables and a, b, and c
represent constants.

A1. x*(y*z) = (x*y)*z

A2. P(x*y) ∨ ~P(x) ∨ ~P(y)

A3. P(a)

A4. P(b)

A5. P(c)

A6. ~P(a*(b*c))

The program first attempts to find a contradiction by repeated
use of all the deduction rules except R9 and R10. At this stage, the only
new formula it can generate is A7 which is obtained by substituting A1 into
A6 using rule R11.³

³ Strictly speaking, we are substituting the right side of A1 into
A6 after the match between x*(y*z) and a*(b*c) produced x=a, y=b, and z=c.

A7. ~P((a*b)*c)
Actually, the program would not have considered A7 as distinct from A6
if it had been told that the symbol * satisfies the associative law A1
(see section 8). However, for purposes of exposition, we have assumed
in this example that the program has no explicit knowledge that * is
associative.

The remainder of this proof can be understood best by referring
to Figure 1 which describes the goal-subgoal hierarchy that was generated
from this example.




The node labeled T in Figure 1 represents the top level goal which consists
of the desire to obtain a contradiction from the axioms A1 through A6.
Since after generating the formula A7 the program finds that it has
exhausted its resources without obtaining a contradiction, it now turns to a
reasoning by cases argument (i.e. rules R9 or R10). Axiom A2 is chosen to
be split into cases which become subgoals of T as shown in Figure 1. This
generates A8.

A8. P(x*y). Case 1 of A2. 

Once again, the program attempts to find a contradiction using all
the resources at its disposal except rules R9 and R10. This attempt
discovers two distinct contradictions. The first contradiction is between A6
and A8 for x=a, y=b*c and generates the formula A9.

A9. ~P(a) ∨ ~P(b*c). Cases 2 and 3 of A2 for x=a, y=b*c.
The second contradiction is between A7 and A8 for x=a*b, y=c and generates
the formula A10.

A10. ~P(a*b) ∨ ~P(c). Cases 2 and 3 of A2 for x=a*b, y=c.

After no more new formulas can be generated nor new contradictions
obtained, the program discharges all formulas that resulted from the attempt
to solve case 1 of A2 but formulas A9 and A10 now replace cases 2 and 3 of
A2. This means that A1 through A7 together with A9 and A10 are the only
formulas under consideration by the program once case 1 of A2 has been
discharged. The program now chooses to split A9 into cases as shown in Figure
1. This generates A11.

A11. ~P(a). Case 1 of A9.
A contradiction is obtained between A3 and A11. This results in A11 being
discharged and A12 generated.

A12. ~P(b*c). Case 2 of A9.

Again, an attempt is made to solve this last case (i.e. "prove P(b*c)")
without further use of a reasoning by cases argument (i.e. rules R9 or R10).
This attempt fails. In fact, it even fails to generate any new formulas.
However, rather than abandon its attempt to solve this case, the program
splits A10 and considers the formulas A13 and A19 that result from the split
as subgoals of A12 as shown in Figure 1. Although the solution of A13 and
A19 would solve T, they are treated as subgoals of A12 since the program has
no a priori way of knowing whether the information derived from A12 will be
needed in the solutions to A13 and A19. The program now attempts to solve
A13.

A13. ~P(a*b). Case 1 of A10.

The attempt to solve A13 also fails so an attempt is made to split A2.
Although the solution of the goals created by the split of A2 would solve
T, these goals are attached as subgoals of A13 as shown in Figure 1 since
information provided by their ancestor goals might help in their solution.
Indeed, it should be recalled that A2 was split once before and resulted in
the formulas A9 and A10. The reason another attempt is made to split A2 is
that new formulas (i.e. A12 and A13) now are available which were not
available during the previous attempt.

The program now attempts to solve A14.

A14. P(x*y). Case 1 of A2.
This second attempt to split A2 results in two new contradictions. The
first contradiction is between A12 and A14 for x=b, y=c and generates the
formula A15.

A15. ~P(b) ∨ ~P(c). Cases 2 and 3 of A2 for x=b, y=c.
The second contradiction is between A13 and A14 for x=a, y=b and generates
the formula A16.

A16. ~P(a) ∨ ~P(b). Cases 2 and 3 of A2 for x=a, y=b.

After no more new formulas can be generated nor new contradictions
obtained, the program discharges all formulas that resulted from the attempt
to solve case 1 of A2 but formulas A15 and A16 now replace cases 2 and 3 of
A2. The program now chooses to split A15 into subgoals of A13 as shown in
Figure 1.

A17. ~P(b). Case 1 of A15.
A contradiction is obtained between A4 and A17. This results in A17 being
discharged and A18 generated.

A18. ~P(c). Case 2 of A15.
A contradiction is obtained between A5 and A18. Now since both the
generation of A17 and A18 as well as the solution of these goals in no way
depended upon the parent goal A13, the program can conclude the proof of A12
immediately without attempting the proof of A19. By contrast, the solution
of A12 did involve information derived from A12 since its solution depended
upon A15 which was derived from A12. However, since A12 is the last goal in
the split of A9, this allows the program to conclude that it has a solution
to the top level goal T.



Although in this simple example the proof of A19 would have been 
trivial, one cannot expect to be so fortunate in general. 



The above example by its very simplicity failed to emphasize the
role played by the equality inference rule and modus ponens in the
present theorem proving program. A very important feature of the program is
its mechanism for controlling the number of formulas generated by the
equality inference rule and this is described in section 7. Also, a formula
coded as A⊃B is used with modus ponens as a means for solving a case
whereas the same formula coded as ~A∨B would be used in generating the
separate cases ~A and B. One encounters a number of formulas such as
x=y ⊃ x*z=y*z whose usefulness lies in the help they provide in solving
a case rather than as a means for generating separate cases. The program
has facilities for exploiting such formulas and these facilities are
described in section 5.

6. Reasoning by Cases 

The initial data given to the program is the list (L_0, L_1) where L_1
is a list of formulas from which a contradiction is to be found and L_0 is an
empty list. A formula F is removed from L_1 and prepared as a possible input
to one or more of the deduction rules. F is not allowed to be of the type
A∨B or ~(A∧B) since we wish to make a concerted attempt to solve the problem
before splitting it into subproblems. If F can be applied successfully to a rule
which requires only one input (i.e. rule R2, R3, R4 and R5), then F is discarded
and the output from this application of the deduction rule is placed on L_1.
For example, if F is the formula ~(A⊃B), then F would be discarded and instead
the formulas A and ~B would be inserted on L_1.

If F cannot be applied successfully to a one input deduction rule,
then an attempt is made to apply it to a two input rule by using a formula
from L_0 as the second input. A successful application to such a rule would
mean that the output would be placed on L_1 unless the rule is R1 in which
case a contradiction would be found. After every formula on L_0 has been
paired with F as the second input to each deduction rule, the formula F
gets placed on L_0. A new formula F now is removed from L_1 and the
procedure repeats itself until either a contradiction has been obtained, the
list L_1 has been exhausted except for formulas of the type A∨B or ~(A∧B),
or a time limit has been exceeded.
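
The loop just described can be sketched as follows. This is a simplified illustration rather than the program's executive: it is restricted to variable-free formulas so that matching can be omitted, it covers only rule R1 and the modus ponens rules R6 through R8, and it assumes the one-input rules have already been applied. The tuple encoding, the function names and the `max_steps` stand-in for the time limit are all assumptions of the example.

```python
def negate(literal):
    # ~~A is written as A, so negating a negated literal removes the "not".
    if isinstance(literal, tuple) and literal[0] == "not":
        return literal[1]
    return ("not", literal)


def is_split_formula(formula):
    # A or B, and ~(A and B), are reserved for reasoning by cases (R9, R10).
    if not isinstance(formula, tuple):
        return False
    if formula[0] == "or":
        return True
    return formula[0] == "not" and isinstance(formula[1], tuple) and formula[1][0] == "and"


def two_input(f, g):
    """Apply a two-input rule to the pair (f, g), trying both orders."""
    for a, b in ((f, g), (g, f)):
        if a == negate(b):                                # R1
            return ("contradiction", None)
        if isinstance(b, tuple) and b[0] == "implies":
            if a == b[1]:                                 # R6: A, A implies B  ->  B
                return ("new", b[2])
            if a == negate(b[2]):                         # R7, R8: contrapositive use
                return ("new", negate(b[1]))
    return None


def local_executive(formulas, max_steps=1000):
    processed, unprocessed = [], list(formulas)           # the lists L_0 and L_1
    for _ in range(max_steps):                            # stand-in for the time limit
        pending = [f for f in unprocessed if not is_split_formula(f)]
        if not pending:
            return "exhausted", processed, unprocessed
        formula = pending[0]
        unprocessed.remove(formula)
        for other in processed:
            result = two_input(formula, other)
            if result is None:
                continue
            if result[0] == "contradiction":
                return "contradiction", processed, unprocessed
            new = result[1]
            if new != formula and new not in processed and new not in unprocessed:
                unprocessed.append(new)
        processed.append(formula)
    return "time limit", processed, unprocessed


status, _, _ = local_executive(["P", ("implies", "P", "Q"), ("not", "Q")])
print(status)
```

When the loop returns "exhausted", the formulas left on the unprocessed list are exactly the disjunctions and negative conjunctions that the reasoning by cases rules would then take up.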

If a contradiction cannot be found in the above manner, then an
appeal must be made to rules R9 or R10. Rule R9 splits a disjunction of
the form B_1 ∨ B_2 ∨ ... ∨ B_r whereas rule R10 splits a negative conjunction
of the form ~(B_1 ∧ B_2 ∧ ... ∧ B_r). If at this stage there are no
disjunctions or negative conjunctions on L_1, then the program must admit failure.
Otherwise, a disjunction or negative conjunction is taken from L_1 and a
list K_1 is constructed as follows. K_1 is a list of four elements. K_{1,1}
(i.e. the first element on list K_1) initially represents the list
(B_1, B_2, ..., B_r). K_{1,2} is either the symbol ∨ or the symbol ∧ depending
upon whether an application of rule R9 or R10 is being made. K_{1,3} initially
is an empty list. K_{1,4} initially is a list of the variables appearing in
any of the formulas B_1, B_2, ..., B_r. Although the following discussion
will assume that rule R9 is being applied, the treatment of rule R10 is
very similar by virtue of the logical equivalence of ~(A∧B) with ~A∨~B.



As will be shown in section 7, a successful application of rule
R11 to F often will cause F to be discarded immediately in favor of the
resulting output which is placed on L_1.

The action taken by the program if the time limit is exceeded 
is described at the end of the present section. 
The state of affairs is summarized now by the list (L_0, K_1, L_1, L_2)
where L_2 is a list consisting of the first formula on list K_{1,1} (which
in this case is B_1) together with all the disjunctions and negative
conjunctions of L_1. As before, a proof is attempted first without using
rules R9 or R10. Only this time, the output of a deduction rule is
placed on L_2 instead of on L_1, a formula F which is removed from L_2 and
applied to the deduction rules is either eventually discarded or
transferred to L_1 instead of to L_0, and the deduction rules which require
two inputs will pair the formula F with formulas from either L_0 or L_1
instead of just from L_0.

Suppose a contradiction is obtained. How should we proceed? An
obvious method would be to (1) remove B_1 from the list K_{1,1} of cases
which have not yet been solved and place it on the list K_{1,3} of cases
which have been solved already, (2) reestablish L_1 as it had been just
prior to the creation of K_1, (3) erase L_2 and replace it with a new
list L_2 consisting of the next case B_2 (which is now the first formula
on list K_{1,1}) together with all the disjunctions and negative
conjunctions of L_1, and (4) search for a contradiction in a similar manner as
in case 1 when B_1 had been the controlling hypothesis.

The difficulty with the above four step method is that, in the
solution to case 1, a variable x appearing in B_1 may have been set equal
to some term t; if this variable x also appears in one of the formulas
B_2, B_3, ..., B_r, then this specification that x should equal t must
prevail also in at least one of these subsequent cases. In order to
insure that x in fact does equal t in this latter case, the variables of
B_1, B_2, ..., B_r are classified into types according to whether the
variable is to be given a "universal" or "existential" interpretation.

Normally, all the variables would have universal interpretations 
(i.e., if formula A(x) is true, then it is true for all values of the
variable x), since all the existential variables already were eliminated
by the procedure described in Section 2. However, in attacking Case 1,
we now place an existential interpretation on those variables which appear
in both B_1 and one or more of the formulas B_2, B_3, ..., B_r (i.e., if
A(x)∨B(x) is assumed to be true, then we must find some value of x which
will permit a solution to each case). For this purpose, we will say
that an existential variable has been specified if it is set equal to a 
term t where t is not a universal variable. 
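
The classification of variables can be sketched as below. This is an illustration under assumptions, not the program's routine: terms use a hypothetical nested-tuple encoding, and the names in `VARIABLES` are taken by convention to be the variables of the problem.

```python
# Assumed convention: these names are variables, every other string is a constant.
VARIABLES = {"x", "y", "z"}


def variables_of(term):
    # Collect the variables occurring anywhere inside a nested-tuple term.
    if isinstance(term, tuple):
        found = set()
        for arg in term[1:]:
            found |= variables_of(arg)
        return found
    return {term} & VARIABLES


def existential_variables(cases):
    """Variables of the first case that also occur in one or more later cases."""
    later = set()
    for case in cases[1:]:
        later |= variables_of(case)
    return variables_of(cases[0]) & later


def is_specified(term, universal_variables):
    # Setting an existential variable equal to a universal variable does not tie it down.
    return term not in universal_variables


# The three cases of axiom A2: P(x*y), ~P(x), ~P(y).
a2_cases = [("P", ("*", "x", "y")), ("not", ("P", "x")), ("not", ("P", "y"))]
print(sorted(existential_variables(a2_cases)))
```

For A2 both x and y are existential while case 1 is under attack, which is why each contradiction found in the example carries its instantiation of x and y over to cases 2 and 3.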

Now in first attacking Case 1, we seek applications of the deduction 
rules (except rules R9 and R10) and subject these rules to the
restriction that no existential variable may be specified unless the
specification occurs during a successful application of rule R1 (i.e., we specify
an existential variable only when so doing will assure the proof of a
subproblem thereby providing a valuable restriction on the generation
of subproblems).

If a successful solution is found which does not specify any 
existential variables, then we proceed directly to Case 2 by the four 



Note that if an existential variable has been set equal to a
universal variable, then the existential variable is still free to assume any
value which can be assumed by the universal variable. However, since a
universal variable by definition can assume any value, this means that
this existential variable has not really been "tied down" or "specified"
upon being set equal to the universal variable.



step method described above (i.e., we assume B_2 and discard all formulas
that were generated during case 1 when B_1 was the hypothesis of the case).
The reason we go directly to case 2 when no existential variable was 
specified is that the solution of case 1 would not have committed any 
variable appearing in any of the subsequent cases. 

However, suppose the successful solution necessitated the specifica- 
tion of at least one existential variable. Then we would not proceed 
directly to case 2. Instead, the use of existential variables enables 
the program to find a number of different solutions to case 1 during the 
same attempt at solving the case; this avoids the duplication of effort 
that would appear if the program were to proceed directly to case 2 as 
soon as it found a single solution only to find later that it must 
generate still more solutions to case 1 in order to solve the original
problem. In particular, let x represent the vector of existential
variables. Each time a solution to case 1 is obtained which resulted in
a different specification t of the vector x, we generate the disjunction
of the remaining cases for this specification (i.e., we generate the
formula B_2(t) ∨ B_3(t) ∨ ... ∨ B_r(t)). After no more solutions can be found,
the program discharges B_1 and all formulas generated from B_1 except
those disjunctions B_2(t) ∨ B_3(t) ∨ ... ∨ B_r(t) each of which represents the
remaining cases associated with a different solution vector t. Rather
than consider any of the remaining cases 2 through r, the program later



Any existential variable which was left unspecified by this solu- 
tion would be replaced by a new universal variable in the formula 
B_2(t) ∨ B_3(t) ∨ ... ∨ B_r(t).



will apply a reasoning by cases analysis to one or more of these
disjunctions B_2(t) ∨ B_3(t) ∨ ... ∨ B_r(t). However, the original disjunction
B_1 ∨ B_2 ∨ ... ∨ B_r still would be retained for a possible future application
of rule R9. The reason for this retention is that subsequent applications
of rules R9 and R10, by generating a goal-subgoal hierarchy such as
was illustrated in Figure 1, could provide additional formulas which
might form the basis for new solutions to case 1 of this disjunction.

So far we have assumed that we did not need an additional reasoning
by cases analysis in order to solve case 1. However, if another
application of rule R9 or R10 is needed, the program must decide whether
the application should be treated as a subgoal of case 1 or whether
the case analysis of B_1 ∨ B_2 ∨ ... ∨ B_r should be abandoned altogether.
The heuristic that is used to determine whether case 1 is worth pursuing
is the presence or absence of existential variables in B_1. Thus,
case 1 would be abandoned if and only if an existential variable appears
in B_1. This heuristic also simplifies the programming since it means
that we do not have to compare existential variables that originated
from different disjunctions. In any event, the abandonment of case 1
is not necessarily permanent. Case 1 is just postponed in favor of an
attempt to solve the cases of some other disjunction or negative
conjunction. This attempt could provide the material with which to
achieve solutions to case 1 of B_1 ∨ B_2 ∨ ... ∨ B_r when and if this
disjunction is reactivated at a later date.



9 
If there were no more disjunctions or negative conjunctions, then 

not only would the abandonment of case 1 become permanent but the program 

would terminate its attempt to solve the original problem as well. 



In general, the state of the system is described by a list
(L_0, K_1, L_1, K_2, L_2, ..., K_n, L_n, L_{n+1}) where K_i contains the
information controlling the reasoning by cases analysis of some specific
disjunction or negative conjunction and is defined in a manner similar
to K_1. Thus, K_{i,1} represents the list of cases which have not yet been
solved, K_{i,2} is the symbol ∨ or ∧ depending upon whether an application
of rule R9 or R10 is being made, K_{i,3} is the list of cases which have
been solved already, and K_{i,4} is the list of existential variables
appearing in formulas on list K_{i,1}. The first formula on list K_{i,1}
represents the case currently under attack and K_{i+1}, K_{i+2}, ..., K_n
were generated in an attempt to solve this case. For 1 ≤ i ≤ n, formulas
appearing in L_i already have been processed by the deduction rules and
are under the immediate control of K_i. Formulas appearing in L_0 were
processed prior to the application of any reasoning by cases analysis.
Formulas appearing in L_{n+1} have not yet been processed by the
deduction rules.

A new case gets initiated at what is then the lowest level of the
goal-subgoal hierarchy. If this lowest level is n, then the empty list
$L_{n+1}$ would be created. The first formula on list $K_{n,1}$ would be placed
on $L_{n+1}$ if $K_{n,2} = \lor$; if $K_{n,2} = \land$, then the negation of this formula
would be placed on $L_{n+1}$. An attempt first is made to solve this case
using the deduction rules (except rules R9 and R10) and the output of
these rules is placed on $L_{n+1}$. If a formula F, which appears on
$L_{n+1}$, is applied to a two input deduction rule, then the second
input would come from one of the lists $L_0, L_1, \ldots, L_n$. If F is neither
a disjunction nor a negative conjunction, then it eventually would be
either discarded or transferred from $L_{n+1}$ to $L_n$.
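
The bookkeeping just described can be pictured with a small data structure. The following is only a sketch under stated assumptions: formulas are plain strings, a leading "~" stands for negation, and the names CaseControl, ProverState, negate and start_case are hypothetical and do not come from the IPL-V program.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CaseControl:
    """One K_i: bookkeeping for the case analysis of a single split formula."""
    unsolved: List[str]            # K_{i,1}; the first entry is the case under attack
    connective: str                # K_{i,2}; "or" for rule R9, "and" for rule R10
    solved: List[str] = field(default_factory=list)        # K_{i,3}
    existentials: List[str] = field(default_factory=list)  # K_{i,4}


@dataclass
class ProverState:
    """The list (L_0, K_1, L_1, ..., K_n, L_n, L_{n+1})."""
    L: List[List[str]]             # L[0] .. L[n]: processed formulas per level
    K: List[CaseControl]           # K[0] is K_1, ..., K[n-1] is K_n
    unprocessed: List[str] = field(default_factory=list)   # L_{n+1}


def negate(formula: str) -> str:
    """Hypothetical helper: a leading "~" marks negation in this sketch."""
    return formula[1:] if formula.startswith("~") else "~" + formula


def start_case(state: ProverState) -> str:
    """Create a fresh L_{n+1} seeded with the hypothesis of the active case."""
    k_n = state.K[-1]
    case = k_n.unsolved[0]
    # For a disjunction the case itself is assumed; for a negative
    # conjunction the negation of the listed formula is assumed.
    hypothesis = case if k_n.connective == "or" else negate(case)
    state.unprocessed = [hypothesis]
    return hypothesis
```

For instance, with K_n holding the cases P and Q of a disjunction, start_case places P on the unprocessed list; with the connective set to "and" it places ~P instead.
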

Since the application of $B_1 \lor B_2 \lor \cdots \lor B_r$ to rule R9 means that the
program must find r solutions instead of just one solution, it is a
matter of great concern whether $B_1 \lor B_2 \lor \cdots \lor B_r$ is really needed for the
proof; for if it is not needed, then its application to rule R9 could
be a great waste of effort. Furthermore, if for each of these r cases,
the program should choose unnecessarily a formula $C_1 \lor C_2 \lor \cdots \lor C_t$ for
application to rule R9, then it would have to find rt solutions when
only one was really needed. Clearly, the computational effort could
snowball if the program is not careful about how it applies formulas to
rules R9 and R10. This is not just a theoretical possibility. In the
course of searching for a proof, it is not unusual to generate many
irrelevant disjunctions and negative conjunctions. Which of the many
disjunctions and negative conjunctions is the program to choose for
application to rules R9 and R10?

One way of attacking this problem is to let the decision to select 
a particular disjunction or negative conjunction represent the node of 
a goal tree. Although the use of goal trees is a common approach in 
artificial intelligence research, it will not work here because a good 
method for evaluating these nodes is not readily available. 

Let us summarize the situation which has just been described.
Reasoning by cases offers the opportunity to decompose a problem by
considering each case separately and automatically erasing all formulas
that were generated during the attempt to solve a case before proceeding
to any of the subsequent cases. On the other hand, reasoning by cases
could lead to a disastrous explosion of subproblems under consideration
by the theorem prover. How then do we prevent such an explosion from
taking place?

The escape from our dilemma turns out to be surprisingly simple
and one which is likely used by human theorem provers as well. The basic
idea is to determine whether the solution to a case depends upon the
hypothesis of the case; for if no such dependence is found, then none
of the subsequent cases need be considered.

Recall that the system is described by a list $(L_0, K_1, L_1, K_2, L_2,
\ldots, K_n, L_n, L_{n+1})$ where $K_i$ controls the case analysis at the $i$th
level of the goal-subgoal hierarchy. We say that formula F depends upon
$K_i$ if either F is the hypothesis of one of the cases of $K_i$ or else the
derivation of F utilized this hypothesis at least once as an input to
one of the deduction rules. We then define D(F) = dependence of formula
F = the set of those $K_i$ upon which F depends. We next define $D_{i,t}$ =
dependence of the solution to case t of $K_i$ = the union of all D(F) taken
over all formulas F that appeared in the solution to case t of $K_i$. In
particular, suppose the solution to case t of $K_i$ itself utilized a
reasoning by cases argument. Then this additional case analysis must
have been solved under the control of some $K_{i+1}$. Letting $n_{i+1}$
represent the number of cases that were solved in the case analysis of
$K_{i+1}$, and letting $D_{i+1,0}$ represent the dependence of the particular
disjunction or negative conjunction which generated $K_{i+1}$, we would
obtain $D_{i,t}$ as the union of all $D_{i+1,j}$ taken over all integers j for
which $0 \le j \le n_{i+1}$.

Upon obtaining a solution to case t of $K_i$, the program asks "Is
$K_i$ a member of $D_{i,t}$?" If the answer is yes, the program goes directly
to the next case of $K_i$. However, if the answer is no, then it skips
over all the remaining cases of $K_i$ and instead immediately concludes
that it has solved the currently active case of $K_{i-1}$.

Furthermore, before a case of $K_i$ actually is attempted, the program
first examines the lists $K_{j,3}$ for all j < i in order to determine whether
the case had been solved already. If the answer is yes, the program
assigns the same dependence to the solution of this case as prevailed
for its previously solved duplicate and then skips over this case by
proceeding directly to the next case of $K_i$.
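
The dependence test can be sketched as follows. This is an illustration only, not the program's own code. It assumes that levels are identified by integers, that a hypothetical callback solve_case returns the dependence of the solution it found (or None when no solution is found), and that the dependence reported after all cases are solved is the plain union described above. The check for previously solved duplicate cases is omitted.

```python
from typing import Callable, FrozenSet, List, Optional, Sequence, Tuple


def run_case_analysis(
    level: int,
    cases: Sequence[str],
    solve_case: Callable[[str], Optional[FrozenSet[int]]],
    split_dependence: FrozenSet[int] = frozenset(),
) -> Optional[Tuple[FrozenSet[int], List[str]]]:
    """Work through the cases controlled by K_level.

    Returns (dependence of the result, cases actually attempted),
    or None if some case could not be solved.
    """
    attempted: List[str] = []
    union = set(split_dependence)      # dependence of the formula that was split
    for case in cases:
        attempted.append(case)
        dep = solve_case(case)
        if dep is None:
            return None                # this case could not be solved
        if level not in dep:
            # The solution never used the hypothesis of the case, so it
            # already settles the parent case; remaining cases are skipped.
            return frozenset(dep), attempted
        union |= dep
    return frozenset(union), attempted
```

With three cases at level 2, if the solution to the second case depends only on level 1, the third case is never attempted.
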

The program also would not attempt to split a formula F into cases
if one of the cases of F represented a specialization of a literal
already appearing on some $L_i$ (i.e., there is no point in examining any
of the cases of F unless each of these cases is supplying us with some
new information). For this reason, a further restriction is placed
upon the specification of an existential variable. A specification of
an existential variable is allowed by $K_n$ only if it does not transform
a subsequent case of $K_n$ into a specialization of a literal already
appearing on an $L_i$ for some $i \le n$.



We have said nothing as yet about the order by which disjunctions
and negative conjunctions get activated by rules R9 and R10. Associated
with every goal G is the list S(G) of those disjunctions and negative
conjunctions that were available to G at the time of its activation.
The attempt at solving G without utilizing rules R9 and R10 results in a
new list T(G) consisting of the elements of S(G) followed by those
disjunctions and negative conjunctions that were created during this attempt.
If it is decided to sprout subgoals from G, then the first available
formula F on list T(G) becomes the instrument for the split. Suppose
G' is one of the cases obtained from the split of F. If no two cases
of F have the same variable in common, then S(G') would be T(G) with
formula F removed as there then would be no need to split F more than
once. However, if the same variable appears in more than one case of F,
then we may wish to seek additional splits of F at a later date and so
S(G') would be T(G) with formula F transferred from the first to the
last element in the list.
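
The construction of S(G') from T(G) can be sketched in a few lines. The function name child_split_list and the representation (formulas as strings, each case summarized by the set of its variable names) are assumptions made for this illustration.

```python
from typing import List, Sequence, Set


def child_split_list(t_of_g: Sequence[str], f: str,
                     case_vars: Sequence[Set[str]]) -> List[str]:
    """Return S(G') for a case G' obtained by splitting formula f of T(G).

    case_vars[k] is the set of variables appearing in case k of f.
    """
    rest = [g for g in t_of_g if g != f]
    seen: Set[str] = set()
    shared = False
    for variables in case_vars:
        if seen & variables:           # a variable occurs in two cases
            shared = True
        seen |= variables
    # f is dropped when its cases share no variable; otherwise it is
    # moved to the end so that it can be split again at a later date.
    return rest + [f] if shared else rest
```

Thus child_split_list(["F", "G", "H"], "F", [{"x"}, {"y"}]) gives ["G", "H"], whereas cases sharing the variable x give ["G", "H", "F"].
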



If the attempt to solve G' without R9 and R10 fails to find a
single new solution and an existential variable appears in G', then F
is abandoned and a special mark is placed on F to inform the program that
F no longer is available for a split. However, if the attempt to solve
G' without R9 and R10 fails to find a solution but no existential variable
appears in G', then G' would become a parent goal whereupon this special
mark would be removed from all formulas of T(G). The rationale for
removing these marks (and thereby providing new opportunities for splits)
is that the use of G' as a parent goal would provide new information that
had not been available when these marks originally were inserted.



There is still one loose end that needs to be tied. We have said
that in attempting to solve a case we first seek a solution using all
the deduction rules except R9 and R10 but impose a maximum limit on the
time spent looking for such a solution. The reason for this maximum
time limit is that we do not wish to commit too much of our computational
resources in this attempt when additional use of rules R9 or R10 may
be necessary. Therefore, if the time limit expires before a solution
can be found and the program does not wish to abandon the goal, it would
remove from the system those formulas $F_1, F_2, \ldots, F_p$ which have not
yet been processed by the deduction rules and combine them into a single
compound formula $\lnot(a = a) \lor (F_1 \land F_2 \land \cdots \land F_p)$. This compound formula
then is placed at the bottom of the list of disjunctions and negative
conjunctions that are associated with the current goal where it would
be applied to rule R9 only after those formulas which precede it on the
list. This allows the program to continue work on a goal by utilizing
a reasoning by cases argument without waiting to process all the
formulas $F_1, F_2, \ldots, F_p$. If it should turn out that one of the formulas
$F_1, F_2, \ldots, F_p$ is necessary for the solution to the goal, then
eventually the disjunction $\lnot(a = a) \lor (F_1 \land F_2 \land \cdots \land F_p)$ would get split
by rule R9. In that event, case 1 consisting of the hypothesis $\lnot(a = a)$
would be solved trivially and $F_1 \land F_2 \land \cdots \land F_p$ (the hypothesis of case 2)
would be decomposed by rule R3 thereby reestablishing the formulas
$F_1, F_2, \ldots, F_p$.



5. Some Facilities for Representing Procedural Information

When presented with a new axiom, a human often will have some ideas
about how the axiom is to be used. A program that can absorb these
ideas has an advantage when it comes to solving actual problems. The
present program has three main channels through which such information
can be received and utilized.

First, as mentioned in Section 3, the procedure by which an axiom
$A \supset B$ is used in a problem is different from the procedure associated with
the logically equivalent axiom $\lnot A \lor B$; the first formulation is used to
help solve a case, whereas the second formulation is used to split a
problem into separate cases.

Second, a routine, associated with an input F to a two input 
deduction rule, can decide whether a particular formula should be 
paired with F as the second input to the rule. Third, descriptive 
information, associated with one of the inputs, can be passed along to 
the output of a deduction rule; this descriptive information might be 
a factor later in deciding whether this output formula should be accepted 
as a second input to a particular two input rule. 

For example, before the formulas $A \supset B$ and A' are applied to rule R6,
the program checks to see whether a special attribute appears with $A \supset B$.
If this attribute is not present, then the program would continue its
attempt to apply $A \supset B$ and A' to rule R6. However, if this attribute
does appear with $A \supset B$, it would have some IPL-V routine R associated
with it; the program then would execute the routine R using the formula
A' as input data to the routine. Under these circumstances, the
decision to continue (with this application of $A \supset B$ and A' to rule R6)
would be made on the basis of the result obtained from this execution
of routine R. Thus, a routine R associated with the formula xEGZjx CG 
might reject (a ) eG as the second input to rule R6 in order to 
avoid an endless application of rule R6 to xCGDx CG. Another way to 
handle this example stems from the fact that any attribute appearing
with formula B gets transmitted to the output formula B' upon the
successful application of $A \supset B$ and A' to rule R6. Thus, the routine
R might reject A' if it determined that the creation of A' occurred as
the output from a previously successful application of rule R6 to x£GD 
x EG; routine R could make such a determination merely by checking 
to see whether a special attribute that appears with x EG appears also 
with A'. 

There is one attribute, known as the "expansion" attribute, which
is processed by the program in a special way. Thus, if the expansion
attribute appears with a literal, then the program does not allow the
literal to be used as an input to any of the deduction rules except
rule R11 and then only in the role of A(r'). After each previously
generated formula of the type r = t has had a chance to be paired with
this expansion literal for a possible application to R11, the expansion
literal is removed from the system. For example, the placement of the
expansion attribute with x*z > y*z in the formula $x > y \supset x*z > y*z$
would cause this latter formula to generate expansions. Thus, formulas
b > c and $x > y \supset x*z > y*z$ applied to rule R6 would generate the
expansion b*z > c*z. Similarly, the application of $\lnot(b = c)$ with the
cancellation law $x*z = y*z \supset x = y$ to rule R7 would generate $\lnot(b*z = c*z)$
as an expansion.

6. The Problem Solving Executive 

Any problem solving program must have some overall scheme for
allocating its computational resources. This "global" allocation
already has been described for the present program in Section 4 in
connection with the implementation of rules R9 and R10. However, the
present program also must conduct "local" searches which are
characterized by an attempt to find a contradiction without further use of rules
R9 or R10. These local searches must likewise have a procedure that
controls and guides the computational effort; we describe now this
local allocation.

A non-literal is given priority ahead of a literal when it comes
to deciding the next formula to be removed from $L_{n+1}$ and applied
to the deduction rules. Among non-literals, the order is first come
first served. Among literals, priority is determined on the basis of
a lexicographic ordering which chooses the literal which (1) depends
upon the fewest number of the $K_i$ (i.e., so that a solution obtained
from this literal would have a better chance of not necessitating the
solution of too many additional cases), and in the event of a tie



However, we would not generate the expansion if either b or c
were of the form u*v where u and v were existential variables since we
wish to avoid the effort of trying to create a split by solving for
one existential variable in terms of the other.



attempts to choose a literal which (2) does not possess the expansion 
attribute, and in case of still another tie chooses the literal which 
(3) has the least complexity where complexity is measured by the storage 
space taken up by the literal. 
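
The local priority scheme amounts to a lexicographic sort key. The sketch below assumes a hypothetical Formula record in which the length of the printed text stands in for storage space; it is meant only to illustrate the ordering, not to reproduce the IPL-V routine.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Formula:
    text: str
    is_literal: bool
    depends_on: FrozenSet[int] = frozenset()   # the K_i this formula depends upon
    is_expansion: bool = False

    @property
    def complexity(self) -> int:
        # Stand-in for "storage space": the length of the printed form.
        return len(self.text)


def pick_next(queue: List[Formula]) -> Optional[Formula]:
    """Choose the next formula to remove from the unprocessed list."""
    if not queue:
        return None
    for formula in queue:              # non-literals: first come, first served
        if not formula.is_literal:
            return formula
    # Literals: fewest dependencies, then non-expansions, then least complexity.
    return min(queue, key=lambda lit: (len(lit.depends_on),
                                       lit.is_expansion,
                                       lit.complexity))
```

Because False sorts before True, a literal without the expansion attribute wins a tie on the number of dependencies, as criterion (2) requires.
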

However, in trying to find a direct solution to a case involving 
existential variables, we do not process any expansions if a solution 
already has been obtained. The reason for this is that the processing 
of expansions is too expensive to justify their use when looking for 
additional splits since we can do so at a later date if the current 
split should prove insufficient. 

7. The Equality Relation 

It should be noted that unlike the treatment of equality by
resolution based theorem provers [13], rule R11 does not permit the
replacement of a term in a formula unless that formula is a literal nor does
it allow this replacement to be made on the basis of an equality r = t
unless the truth of r = t already has been established. The reason
we can do this is that the reasoning by cases mechanism serves to detach
the individual literals from a formula as it analyzes the separate
cases so that eventually these literals will be available for use by
rule R11. However, the advantage in postponing these replacements is
that it keeps apart those formulas used in solving one case from
formulas used in solving a later case with the result that formulas stemming
from different cases do not interact with each other to produce additional
formulas. Also, since the program does not always have to consider
all cases arising from a split (as was discussed in Section 4), this
postponement enables the program to avoid the necessity of generating
formulas from a later case if the current case should prove to be
irrelevant to the solution of a higher level goal.

In rule R11 we impose the requirement that neither r nor r' can
be variables; for if either r or r' were a variable, then the match of
r with r' always would be trivially satisfied.$^{12}$ The significance of
this restriction is that it limits the application of rule R11 to
situations where its successful application would provide us with some
"information" in the sense of [16] (i.e., the success of a rule gives
us no "information" if its success is a foregone conclusion). The
practical usefulness of this restriction is that it greatly reduces
the number of formulas generated by rule R11 with little risk that
one of the discarded formulas will be necessary for the solution.

In applying the equality b = c as an input to rule R11, the
program will identify b with r and c with t (i.e., it will replace b
by c rather than replace c by b) on the basis of the following eight
conditions, where condition i takes priority over condition j for i < j:
(1) c appears as part of term b,
(2) more variables appear in b than in c,
(3) c is a constant which appears in a special list provided to the
program by the user (so far, this list has consisted only of identity



This assumes of course that r does not appear as a subelement of
r' and vice versa.



symbols such as and I), 

(4) a special attribute appears with the equality b = c which tells
the program that the right side of the equality should be substituted
for the left side (so far, this attribute has been used just once
and that was to denote that b had significance only in its capacity as
the definition of c),
(5) b, but not c, represents an associative product (i.e., b is of
the form r*t where * obeys the associative law; see Section 8 for
discussion of associativity),
(6) neither b nor c represents an associative product, but b is a
function of more arguments than c,
(7) both b and c represent associative products such as $a_1 * a_2 * \cdots * a_n$
where n, the number of terms in the product, is greater for b than it
is for c, and
(8) b is of greater "complexity" than c where, as in Section 6, the
complexity of an expression is measured by the storage space it
occupies.

If neither b nor c can be identified with r on the basis of the
above eight conditions, then an arbitrary choice is made. If a decision
is made to identify b with r but an attribute appearing with formula
b = c indicates a desire that both sides of the equality be given this
opportunity, then c also would be matched with r' and if successful
the output A(t') would be designated as an expansion.
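
The prioritized choice of which side of an equality to replace can be illustrated as follows. This is a simplified sketch under several assumptions: terms are strings (atoms) or tuples of the form (function, arg, ...), "more variables" is read as more distinct variables, complexity is approximated by the number of subterms, and condition (4), which relies on a user supplied attribute, is left out. The name orient and its parameters are hypothetical.

```python
from typing import FrozenSet, Iterator, Optional, Set, Tuple, Union

Term = Union[str, tuple]   # "a", or ("f", arg1, arg2, ...)


def subterms(t: Term) -> Iterator[Term]:
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)


def orient(b: Term, c: Term, variables: Set[str],
           identities: FrozenSet[str] = frozenset(),
           assoc: FrozenSet[str] = frozenset()) -> Optional[Tuple[Term, Term]]:
    """Return (r, t), meaning "replace r by t", or None if no condition decides."""
    def nvars(t):
        return len({s for s in subterms(t) if isinstance(s, str) and s in variables})

    def is_assoc(t):
        return isinstance(t, tuple) and t[0] in assoc

    def nargs(t):
        return len(t) - 1 if isinstance(t, tuple) else 0

    def size(t):
        return sum(1 for _ in subterms(t))

    conditions = [
        lambda l, r: l != r and any(s == r for s in subterms(l)),                 # (1)
        lambda l, r: nvars(l) > nvars(r),                                         # (2)
        lambda l, r: isinstance(r, str) and r in identities,                      # (3)
        lambda l, r: is_assoc(l) and not is_assoc(r),                             # (5)
        lambda l, r: not is_assoc(l) and not is_assoc(r) and nargs(l) > nargs(r), # (6)
        lambda l, r: is_assoc(l) and is_assoc(r) and nargs(l) > nargs(r),         # (7)
        lambda l, r: size(l) > size(r),                                           # (8)
    ]
    for cond in conditions:            # earlier conditions take priority
        forward, backward = cond(b, c), cond(c, b)
        if forward and not backward:
            return (b, c)
        if backward and not forward:
            return (c, b)
    return None                        # the program would choose arbitrarily
```

For the equality e*x = x, condition (1) selects the replacement of e*x by x regardless of the order in which the two sides are supplied.
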



If in rule R11 (1) one of the above eight conditions does prevail
for r = t, (2) A(r') is not an expansion, and (3) the matching of r to
r' does not reduce the generality of r', then the generation of A(t')
by rule R11 allows us with reasonable confidence to eliminate A(r')
from further consideration provided that the new formula A(t') does not
depend upon any of the $K_i$ not already depended upon by A(r'). The
program takes advantage of this by employing rule R11 as the first
deduction rule to be applied to a literal; the literal is given the
role of A(r') in rule R11 and different equalities r = t are paired
with it in the hope that one of these equalities will lead to a quick
elimination of the literal. Indeed, if the application of one of these
equalities to rule R11 generated the literal L' from the literal L
without eliminating L but a subsequent application of a different
equality to rule R11 did eliminate L, then the program would eliminate
the literal L' as well.

We have already restricted the application of rule R11 by not
allowing the specification of an existential variable (i.e., we required
in Section 4 that an existential variable could be specified only if
the specification occurred during an application of rule R1). We now
place a further restriction on rule R11 by not allowing any variable from A(r')
to be specified (i.e., the match of r to r' in rule R11 is not
allowed if it reduces the generality of r') unless A(r') is either an
equality b = c, or its negation $\lnot(b = c)$, or an expansion. In order to
compensate for these restrictions on the treatment of literals which
do not involve the equality predicate, we included R12 as a rule of
inference as this rule takes two such literals as input and produces
the negation of an equality as output. However, since one of the
motivations for rule R12 was its compensation for the restriction of
substitutions into variables, we will require that at least one of the
two input literals to rule R12 must possess a variable.

8. Associativity and Commutativity

Functions which are either associative and commutative or just
associative play a fundamental role in mathematical reasoning. For
example, addition and multiplication in ordinary arithmetic each
satisfy both the associative and commutative laws. On the other hand,
the multiplication of operators (such as found in matrix multiplication)
provides important examples of functions which satisfy the associative
but not the commutative law. In view of the great importance of
associativity and commutativity, special routines were built into the
program in order to provide a more accurate simulation of human problem
solving as well as to exploit better the power which is available
whenever it is known that a particular function satisfies either both
the associative and commutative laws or just the associative law.

An associative function f is one which depends on two arguments
and satisfies the relationship f(x, f(y,z)) = f(f(x,y),z) for all x, y,
and z. It follows from this that the expressions f(a,f(b,f(c,d))),
f(f(a,b),f(c,d)), and f(f(f(a,b),c),d) are all equivalent if f is
associative. Using the more familiar product symbol * in place of f,
the above expressions can be written as a*(b*(c*d)), (a*b)*(c*d), and
((a*b)*c)*d respectively. Clearly, the key feature of an associative
product is that it is independent of the way the parentheses are
grouped. A human who uses an associative product acknowledges this
fact by writing the above expressions as a*b*c*d; at one stroke he
therefore saves processing time as well as memory by avoiding the
necessity of treating these equivalent expressions as distinct entities.
Although an associative product is defined formally as a function of
two arguments, it is used by humans informally as if it were a function
of an indeterminate number of arguments $s_1, s_2, \ldots, s_m$ where m can
be any integer greater than 1. The same point of view is adopted by
the program which, for an associative function f, strips the parentheses
from different expressions involving f by reducing them to the canonical form
$f(s_1, s_2, \ldots, s_m)$. Thus, the program immediately would reduce
$f(f(s_4, f(s_5, s_3)), f(f(s_1, s_2), s_6))$ to $f(s_4, s_5, s_3, s_1, s_2, s_6)$
if it knew f to be associative. Throughout the remainder of this section
it will be assumed that the function f is associative.
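
The reduction to canonical form can be sketched in a few lines. The representation (terms as nested tuples whose first element is the function symbol) and the name flatten are assumptions made for this illustration.

```python
from typing import Set, Union

Term = Union[str, tuple]   # "a", or ("f", arg1, arg2, ...)


def flatten(term: Term, assoc: Set[str]) -> Term:
    """Strip nested parentheses for every function symbol listed in assoc."""
    if not isinstance(term, tuple):
        return term
    head = term[0]
    args = [flatten(arg, assoc) for arg in term[1:]]
    if head in assoc:
        flat = []
        for arg in args:
            if isinstance(arg, tuple) and arg[0] == head:
                flat.extend(arg[1:])   # splice the inner arguments in place
            else:
                flat.append(arg)
        args = flat
    return (head, *args)
```

Applied to the nested expression of the text, flatten yields the six argument form with the arguments in the order s4, s5, s3, s1, s2, s6; a function not listed in assoc is left untouched.
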

Standard match routines, such as described in Section 2, can reduce
and eventually eliminate the differences between two expressions A and
A' only if A and A' have the same structure (i.e., only if in those
places where the two expressions A and A' differ, a variable appears in
one expression which can be equated to the term appearing in the
corresponding part of the other expression). However, for dealing with
associativity (and especially for commutativity) a more generalized
method of matching is useful (such as the pattern matching of [£])
which can rearrange the position of terms within a structure as well
as determine values for variables located at fixed positions.

The present program utilizes a routine MATCHA which can bring
into correspondence two associative functions $f(s_1, \ldots, s_m)$ and
$f(t_1, \ldots, t_n)$ even though m and n may not be equal to each other.
The execution of MATCHA($f(s_1, \ldots, s_m)$, $f(t_1, \ldots, t_n)$) operates
as follows. For purposes of exposition, we extend the definition of
f to allow it to depend on only a single argument by defining
$f(s_1) = s_1$. With no loss of generality, we assume $m \le n$. If m = 1,
an attempt is made to bring $s_1$ into correspondence with $f(t_1, \ldots, t_n)$,
perhaps by a substitution of certain terms for variables, after which
an exit is made from routine MATCHA. Suppose m > 1. We first attempt
to find a substitution which will make $s_1$ identical to $t_1$ and if
successful, we then execute MATCHA($f(s_2, \ldots, s_m)$, $f(t_2, \ldots, t_n)$)
for this substitution. After MATCHA($f(s_2, \ldots, s_m)$, $f(t_2, \ldots, t_n)$)
has been executed, we undo any substitution of a term for a variable
that might have been needed to make $s_1$ identical to $t_1$. At this point,
if m = n or $s_1$ is not a variable, we exit from the routine MATCHA.
Otherwise, beginning with r = 2, we set the variable $s_1$ equal to the term
$f(t_1, \ldots, t_r)$ (provided of course that $s_1$ does not appear in $f(t_1, \ldots, t_r)$)
and then execute MATCHA($f(s_2, \ldots, s_m)$, $f(t_{r+1}, \ldots, t_n)$) for this
substitution; after this has been tried for all integers r such
that $r \ge 2$ and $m - 2 \le n - (r+1)$, an exit is made from the routine
MATCHA. For example, if x and y are variables, then the execution of
MATCHA(f(x,y), f(a,b,c)) would produce two successful matches corresponding
to x = a, y = f(b,c) and x = f(a,b), y = c. Similarly, the execution
of MATCHA(f(x,a,x,a,b), f(b,c,a,b,c,y)) would produce only one successful
match (i.e., for x = f(b,c), y = f(a,b)).
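
The steps of MATCHA can be sketched as a recursive generator. This is an illustration under simplifying assumptions rather than the IPL-V routine: every argument is an atom (a variable or constant name), a binding maps a variable to a tuple of atoms so that a one element tuple plays the role of $f(s_1) = s_1$, and substitutions are passed along functionally so that nothing needs to be undone. The helper names splice, extend and unify_atoms are hypothetical.

```python
from typing import Dict, Iterator, Optional, Sequence, Set, Tuple

Seq = Tuple[str, ...]
Subst = Dict[str, Seq]


def splice(seq: Sequence[str], step: Subst) -> Seq:
    """Apply a substitution to an argument list, stripping parentheses."""
    out = []
    for atom in seq:
        out.extend(step.get(atom, (atom,)))
    return tuple(out)


def extend(subst: Subst, step: Subst) -> Subst:
    """Add the bindings in step, keeping earlier bindings up to date."""
    new = {var: splice(val, step) for var, val in subst.items()}
    new.update(step)
    return new


def unify_atoms(a: str, b: str, variables: Set[str]) -> Optional[Subst]:
    if a == b:
        return {}
    if a in variables:
        return {a: (b,)}
    if b in variables:
        return {b: (a,)}
    return None


def matcha(s: Sequence[str], t: Sequence[str], variables: Set[str],
           subst: Optional[Subst] = None) -> Iterator[Subst]:
    """Yield every substitution found for f(s...) against f(t...), f associative."""
    subst = {} if subst is None else subst
    s, t = tuple(s), tuple(t)
    if len(s) > len(t):                # with no loss of generality, m <= n
        s, t = t, s
    if not s:
        if not t:
            yield subst
        return
    first = s[0]
    if len(s) == 1:                    # m = 1: s_1 against f(t_1, ..., t_n)
        if len(t) == 1:
            step = unify_atoms(first, t[0], variables)
        elif first in variables and first not in t:
            step = {first: t}
        else:
            step = None
        if step is not None:
            yield extend(subst, step)
        return
    step = unify_atoms(first, t[0], variables)   # make s_1 identical to t_1
    if step is not None:
        yield from matcha(splice(s[1:], step), splice(t[1:], step),
                          variables, extend(subst, step))
    if len(s) == len(t) or first not in variables:
        return
    r = 2                              # let s_1 absorb f(t_1, ..., t_r)
    while len(s) - 2 <= len(t) - (r + 1):
        prefix = t[:r]
        if first not in prefix:
            step = {first: prefix}
            yield from matcha(splice(s[1:], step), splice(t[r:], step),
                              variables, extend(subst, step))
        r += 1
```

Under these assumptions, list(matcha(("x", "y"), ("a", "b", "c"), {"x", "y"})) produces the two matches quoted in the text, and the second example of the text produces the single match x = f(b,c), y = f(a,b).
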

The replacement mechanism in rule R11 is designed to take
advantage of knowledge that a function f is associative. Thus, suppose r and
r' in rule R11 are terms of the form $f(r_1, \ldots, r_m)$ and $f(r'_1, \ldots, r'_n)$
respectively where $m \le n$. For each integer j such that $0 \le j \le n - m$,
the program would attempt to find a substitution which would make $r_i$
identical to $r'_{j+i}$ for all i = 1, 2, ..., m. If the program is successful
for some j, then the output of rule R11 for this j would be of the form
A(t') if m = n, $A(f(t', r'_{m+1}, \ldots, r'_n))$ if 0 = j < n - m,
$A(f(r'_1, \ldots, r'_j, t', r'_{j+m+1}, \ldots, r'_n))$ if 0 < j < n - m, and
$A(f(r'_1, \ldots, r'_j, t'))$ if 0 < j = n - m. For example, f(x,x) = e would cause
A(f(a,b,b,c)) to be replaced by A(f(a,e,c)) where m = 2, n = 4 and j = 1.
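
The sliding replacement over contiguous arguments can be sketched as follows. The sketch assumes atomic arguments and one-way matching (only variables of the left side of the equality are bound), which is a simplification of the substitution search described above; rewrite_assoc is a hypothetical name.

```python
from typing import Dict, List, Sequence, Set, Tuple


def rewrite_assoc(lhs: Sequence[str], rhs: str, target: Sequence[str],
                  variables: Set[str]) -> List[Tuple[str, ...]]:
    """Use f(lhs...) = rhs to rewrite contiguous arguments of f(target...).

    Returns one rewritten argument list for every offset j that matches.
    """
    lhs, target = tuple(lhs), tuple(target)
    m, n = len(lhs), len(target)
    results = []
    for j in range(0, n - m + 1):
        binding: Dict[str, str] = {}
        matched = True
        for pattern, actual in zip(lhs, target[j:j + m]):
            if pattern in variables:
                # A repeated variable must meet the same argument each time.
                if binding.setdefault(pattern, actual) != actual:
                    matched = False
                    break
            elif pattern != actual:
                matched = False
                break
        if matched:
            t_prime = binding.get(rhs, rhs)    # instantiate the right side
            results.append(target[:j] + (t_prime,) + target[j + m:])
    return results
```

With the equality f(x,x) = e and the arguments (a, b, b, c), only the offset j = 1 matches and the result is (a, e, c).
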
An additional replacement routine is available for use when m < n
and some $r_i$ is a universal variable. Thus, for each integer j such that
$0 \le j \le n - m$, the program would execute a routine
MATCHB($f(r_1, \ldots, r_m)$, $f(r'_{j+1}, \ldots, r'_n)$). The routine MATCHB is the same as
routine MATCHA except that (1) it does not allow a substitution which
reduces the generality of $f(r'_{j+1}, \ldots, r'_n)$ and (2) it proceeds
differently when it reaches the point where, for some $p \ge j + m - 1$,
it must execute MATCHB($f(r_m)$, $f(r'_{p+1}, \ldots, r'_n)$). Instead of
executing MATCHB($f(r_m)$, $f(r'_{p+1}, \ldots, r'_n)$) by trying to bring $r_m$ into
correspondence with $f(r'_{p+1}, \ldots, r'_n)$ (as MATCHA would have done) it tries
to bring $r_m$ into correspondence with $f(r'_{p+1}, \ldots, r'_q)$ for some integer
q such that $p < q \le n$. If the program is successful for some j and q,
then the output of rule R11 would be A(f(t')) if 0 = j < q = n,
$A(f(t', r'_{q+1}, \ldots, r'_n))$ if 0 = j < q < n,
$A(f(r'_1, \ldots, r'_j, t', r'_{q+1}, \ldots, r'_n))$ if 0 < j < q < n, and
$A(f(r'_1, \ldots, r'_j, t'))$ if 0 < j < q = n. For
example, f(x,x) = e would cause A(f(a,b,c,b,c,a)) to be replaced by
A(f(a,e,a)), where m = 2, n = 6, j = 1, p = 3 and q = 5.

For the remainder of this section it will be assumed that the
function f also satisfies the commutative law (i.e., f(x,y) = f(y,x)
for all x and y). The program uses the routine MATCHC to bring into
correspondence two functions $f(s_1, \ldots, s_m)$ and $f(t_1, \ldots, t_n)$ when it is
known that f is commutative as well as associative. The execution
of MATCHC($f(s_1, \ldots, s_m)$, $f(t_1, \ldots, t_n)$) operates as follows. With
no loss of generality, we assume $m \le n$. If m = 1, an attempt is made
to bring $s_1$ into correspondence with $f(t_1, \ldots, t_n)$ after which an exit
is made from routine MATCHC. Suppose m > 1. We first attempt to
find an $s_i$ (giving priority to those $s_i$ which are not variables)
for which a substitution can be found that makes $s_i$ identical to $t_j$
for some $t_j$. If we are unsuccessful, we exit from the routine MATCHC.
However, if we are successful for some $s_i$ and $t_j$, we execute
MATCHC($f(s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_m)$, $f(t_1, \ldots, t_{j-1}, t_{j+1}, \ldots, t_n)$)
for this substitution and then exit from the routine MATCHC.
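
A sketch of MATCHC along the same lines follows. It assumes atomic arguments, represents a binding as a tuple of atoms, and scans the $t_j$ from left to right (the text does not fix this order); the helper names are hypothetical. In keeping with the description, the routine commits to the first pair it finds and does not backtrack.

```python
from typing import Dict, Optional, Sequence, Set, Tuple

Seq = Tuple[str, ...]
Subst = Dict[str, Seq]


def splice(seq: Sequence[str], step: Subst) -> Seq:
    """Apply a substitution to an argument list, stripping parentheses."""
    out = []
    for atom in seq:
        out.extend(step.get(atom, (atom,)))
    return tuple(out)


def merge(subst: Subst, step: Subst) -> Subst:
    new = {var: splice(val, step) for var, val in subst.items()}
    new.update(step)
    return new


def unify_atoms(a: str, b: str, variables: Set[str]) -> Optional[Subst]:
    if a == b:
        return {}
    if a in variables:
        return {a: (b,)}
    if b in variables:
        return {b: (a,)}
    return None


def matchc(s: Sequence[str], t: Sequence[str], variables: Set[str],
           subst: Optional[Subst] = None) -> Optional[Subst]:
    """Return one substitution for f(s...) against f(t...), or None."""
    subst = {} if subst is None else subst
    s, t = tuple(s), tuple(t)
    if len(s) > len(t):                # with no loss of generality, m <= n
        s, t = t, s
    if not s:
        return subst if not t else None
    if len(s) == 1:                    # m = 1: s_1 against f(t_1, ..., t_n)
        if len(t) == 1:
            step = unify_atoms(s[0], t[0], variables)
        elif s[0] in variables and s[0] not in t:
            step = {s[0]: t}
        else:
            step = None
        return None if step is None else merge(subst, step)
    # Non-variables are tried before variables.
    order = sorted(range(len(s)), key=lambda i: s[i] in variables)
    for i in order:
        for j in range(len(t)):
            step = unify_atoms(s[i], t[j], variables)
            if step is not None:
                rest_s = splice(s[:i] + s[i + 1:], step)
                rest_t = splice(t[:j] + t[j + 1:], step)
                # Commit to this pair and exit with whatever it yields.
                return matchc(rest_s, rest_t, variables, merge(subst, step))
    return None
```

For example, matching f(x,a) against f(a,b,c) first pairs the constant a on both sides and then binds x to f(b,c).
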

The execution of rule R11, when it is known that f is commutative
as well as associative, is governed by the routine REPLACE. Thus,
suppose r and r' in rule R11 are terms of the form $f(r_1, \ldots, r_m)$ and
$f(r'_1, \ldots, r'_n)$ respectively where $m \le n$. The execution of
REPLACE($f(r_1, \ldots, r_m)$, $f(r'_1, \ldots, r'_n)$) operates as follows. Beginning with
j = 1, the program attempts to find a substitution which would make $r_1$
identical to $r'_j$. If the program is successful and m > 1, it would
execute REPLACE($f(r_2, \ldots, r_m)$, $f(r'_1, \ldots, r'_{j-1}, r'_{j+1}, \ldots, r'_n)$)
for this substitution. If the program is successful and m = 1, then
the output of rule R11 for this j would be of the form
$A(f(t', r'_1, \ldots, r'_{j-1}, r'_{j+1}, \ldots, r'_n))$. The program carries out this procedure for
each integer j such that $1 \le j \le n$ (unless a j is found which produces
an output for rule R11 that allows A(r') to be eliminated). For
example, f(x,x) = e would cause A(f(b,a,b,c)) to be replaced by
A(f(a,e,c)).

9. Computational Experience 

The theorem proving program described in this paper was written
in IPL-V and run on IBM 360/50 and 370/145 computers. The maximum
partition available to the program was 250,000 bytes of core storage.
This amounted to 22,000 IPL-V words (after loading the IPL-V interpreter)
of which 7,000 were consumed in loading the program leaving 15,000
IPL-V words for actual work space. Although the available memory was
not large by current standards, computing time was more of a limiting
factor than memory, for IPL-V is an interpretive language and therefore
very slow in execution. The program might well have run an order of
magnitude faster if it had been written in a language that was capable
of execution in a compiled form. The computing times of the examples
reported in this section are for the 370/145.

Among the theorems proved by the program were all nine problems
from group theory and number theory which were reported in [4]. It is
also capable of solving much more difficult problems than these as
evidenced by the examples to be described in this section. It
accomplished this without the use of any bounds on substitutions.

The following interpretations are used with the examples of this
section:

(1) $x \in y$ means "x is a member of set y",
(2) $x \subseteq y$ means "x is a subset of y",
(3) x = y means "set x is identical to set y",
(4) $x \cup y$ means "the union of sets x and y",
(5) $x \cap y$ means "the intersection of sets x and y",
(6) x - y means "the set obtained by taking set x and removing from
it all elements that appear in set y",
(7) U means "the universal set consisting of all elements",
(8) SB(x) means "the set of all subsets of x",
(9) * means "the associative product for the group",
(10) e means "the identity element for the group",
(11) s(x) means "x is a subgroup of the group",
(12) p(X,Y) means "the product set XY consisting of all elements x*y
such that $x \in X$ and $y \in Y$",
(13) prime(x) means "x is a prime number",
(14) $x \mid y$ means "x divides y",
(15) rational(x) means "x is a rational number", and
(16) sqrt(x) means "the square root of x".

Example 1: The set of all subsets of A intersected with the set of 
all subsets of B is identical to the set of all subsets of A ∩ B. 

A1. [(f(x,y) ∈ x ⊃ f(x,y) ∈ y) ∧ (f(x,y) ∈ y ⊃ f(x,y) ∈ x)] ⊃ x = y 
A2. x = y ⊃ [(z ∈ x ⊃ z ∈ y) ∧ (z ∈ y ⊃ z ∈ x)] 
A3. (z ∈ x ∧ z ∈ y) ⊃ z ∈ x ∩ y 
A4. z ∈ x ∩ y ⊃ (z ∈ x ∧ z ∈ y) 
A5. (z ⊆ x ∧ z ⊆ y) ⊃ z ⊆ x ∩ y 
A6. z ⊆ x ∩ y ⊃ (z ⊆ x ∧ z ⊆ y) 
A7. (z ⊆ x ∧ z ∈ U) ⊃ z ∈ SB(x) 
A8. z ∈ SB(x) ⊃ (z ⊆ x ∧ z ∈ U) 
A9. ~(SB(A) ∩ SB(B) = SB(A ∩ B)) 

The program proved example 1 in one minute from the above set of 
initial axioms and generated 36 new formulas in the process. This 
theorem had been proved in [2], but that program utilized routines 
that were especially designed for set theory; it has not been 
an easy theorem for general purpose theorem provers. What is perhaps 
remarkable about the effort of the present program is that it did 
not generate a single formula that was not necessary for the proof 
(i.e., each of the 36 additional formulas generated belonged to the 36 
step proof produced by the program). 
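
The statement of Example 1 can be sanity-checked on small finite sets. 
The following is a minimal sketch, not part of the program described in 
this paper: the function names SB and example1_holds are illustrative 
assumptions, and an exhaustive check over one small universe is only 
evidence for the identity, not a proof of it. 

```python
from itertools import chain, combinations


def SB(x):
    """Return the set of all subsets of x, each represented as a frozenset."""
    items = list(x)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


def example1_holds(universe):
    """Check SB(A) & SB(B) == SB(A & B) for every pair A, B of subsets of universe."""
    candidates = SB(universe)
    return all(SB(a) & SB(b) == SB(a & b) for a in candidates for b in candidates)


if __name__ == "__main__":
    # Exhaustive check over all 64 pairs of subsets of a three-element universe.
    print(example1_holds({1, 2, 3}))
```

The check treats a set z as a member of SB(x) exactly when z is a subset 
of x, which mirrors axioms A7 and A8 with the universal set left implicit. 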



Example 2: The square root of every prime number is irrational. 

A1. x*(y*z) = (x*y)*z 
A2. x*y = y*x 
A3. x = y ⊃ x*z = y*z 
A4. x*z = y*z ⊃ x = y 
A5. e*x = x 
A6. y*x = x ⊃ y = e 
A7. sqrt(x)*sqrt(x) = x 
A8. y|y*x 
A9. y|x ⊃ x = y*h(x,y) 
A10. prime(x) ⊃ (~(x|y*z) ∨ x|y ∨ x|z) 
A11. ~prime(e) 
A12. rational(x) ⊃ (f(x) = x*g(x) ∧ (~(y|f(x)) ∨ ~(y|g(x)) ∨ y = e)) 
A13. ~(prime(a) ⊃ ~rational(sqrt(a))) 

The program proved example 2 in 26 minutes and generated 536 new 
formulas in the process. Although example 2 has been the object of 
considerable attention in the literature [14], [5], [12], the present 
program is the first to prove this theorem without the aid of special 
hints that reflected a previous knowledge of the proof. The program 
produced a 30 step proof which is reproduced below. Since the basic 
operations for equality, such as its reflexivity x = x, are implicit in 
the operation of the program, they are not mentioned directly in the 
proof. Also, no direct mention is made of A1 and A2 since the effect 
of these axioms is implicit in the choice of match and replace routines 
used by the program as described in Section 8. 

Proof of Example 2: 

A14. prime(a)    Rule R5 applied to A13. 
A15. ~~rational(sqrt(a))    Rule R5 applied to A13. 
A16. rational(sqrt(a))    Rule R2 applied to A15. 
A17. ~(a|y*z) ∨ a|y ∨ a|z    Rule R6 applied to A10 and A14. 
A18. (f(sqrt(a)) = sqrt(a)*g(sqrt(a))) ∧ (~(y|f(sqrt(a))) 
     ∨ ~(y|g(sqrt(a))) ∨ y = e)    Rule R6 applied to A12 and A16. 
A19. f(sqrt(a)) = sqrt(a)*g(sqrt(a))    Rule R3 applied to A18. 
A20. ~(y|f(sqrt(a))) ∨ ~(y|g(sqrt(a))) ∨ y = e    Rule R3 applied to 
     A18. 
A21. sqrt(x)*sqrt(x)*z = x*z    Rule R6 applied to A3 and A7. 
A22. sqrt(a)*g(sqrt(a))*z = f(sqrt(a))*z    Rule R6 applied to A3 
     and A19. 
A23. sqrt(a)*f(sqrt(a)) = a*g(sqrt(a))    Substitution of A19 into A21 
     using R11. 
A24. a*g(sqrt(a))*g(sqrt(a)) = f(sqrt(a))*f(sqrt(a))    Substitution of 
     A23 into A22 using R11. 
A25. ~(a|y*z)    Case 1 of A17. 
A26. ~(a*x = y*z) where y and z are the same variables that appeared 
     in A25 (i.e., y and z have been given an existential interpretation). 
     Rule R12 applied to A8 and A25. A contradiction is obtained between 
     A24 and A26 for y = z = f(sqrt(a)) and x = g(sqrt(a))*g(sqrt(a)). 
     This causes A25 and A26 to be replaced by A27. 



A27. a|f(sqrt(a)) ∨ a|f(sqrt(a))    Cases 2 and 3 of A17 for y = z = 
     f(sqrt(a)). 
A28. ~(y|f(sqrt(a)))    Case 1 of A20. 
A29. ~(y*x = f(sqrt(a))) where y is the same variable that appeared 
     in A28. Rule R12 applied to A8 and A28. A contradiction is 
     obtained between A19 and A29 for y = sqrt(a) and x = g(sqrt(a)). 
     This causes A28 and A29 to be replaced by A30. 
A30. ~(sqrt(a)|g(sqrt(a))) ∨ (sqrt(a) = e)    Cases 2 and 3 of A20 
     for y = sqrt(a). 
A31. a|f(sqrt(a))    Case 1 of A27. 
A32. f(sqrt(a)) = a*h(f(sqrt(a)),a)    Rule R6 applied to A9 and A31. 
A33. a*h(f(sqrt(a)),a)*z = f(sqrt(a))*z    Rule R6 applied to A3 and 
     A32. 
A34. a*h(f(sqrt(a)),a)*sqrt(a) = a*g(sqrt(a))    Substitution of A23 
     into A33 using rule R11. 
A35. h(f(sqrt(a)),a)*sqrt(a) = g(sqrt(a))    Rule R6 applied to A4 
     and A34. 
A36. ~(sqrt(a)|g(sqrt(a)))    Case 1 of A30. 
A37. ~(sqrt(a)*x = g(sqrt(a)))    Rule R12 applied to A8 and A36. A 
     contradiction is obtained between A35 and A37. This causes A36 
     and A37 to be replaced by A38. 
A38. sqrt(a) = e    Case 2 of A30. 
A39. e*sqrt(a) = a    Substitution of A38 into A7 using rule R11. 
A40. sqrt(a) = a    Substitution of A5 into A39 using rule R11. 



A41. e = a    Substitution of A38 into A40 using rule R11. 
A42. prime(e)    Substitution of A41 into A14 using rule R11. 
     A contradiction is obtained between A11 and A42. This causes A31 
     through A42 to be replaced by A43. 
A43. a|f(sqrt(a))    Case 2 of A27. The program solved this last 
     case merely by noting that it is the same as a case which had been 
     solved previously (i.e., it is the same as A31). The proof of 
     the theorem now is complete since all outstanding cases have been 
     solved. 
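
The pivotal equation of this proof is A24, which in ordinary arithmetic 
reads a*g*g = f*f, where A12 supplies f and g as a numerator and 
denominator having no common divisor other than e. The sketch below is 
an illustrative assumption rather than anything the program executed: 
it interprets * as integer multiplication and e as 1, and searches a 
bounded range for such a pair. The function name coprime_witnesses and 
the bound are hypothetical, and a bounded search is only a sanity check, 
not a proof. 

```python
from math import gcd


def coprime_witnesses(a, bound):
    """Return all (f, g) with 1 <= f, g <= bound, gcd(f, g) == 1 and f*f == a*g*g."""
    return [
        (f, g)
        for g in range(1, bound + 1)
        for f in range(1, bound + 1)
        if gcd(f, g) == 1 and f * f == a * g * g
    ]


if __name__ == "__main__":
    # For a prime a the search should come back empty; for a perfect square
    # such as 4 it finds the lowest-terms pair whose ratio is the square root.
    for a in (2, 3, 5, 7, 4):
        print(a, coprime_witnesses(a, 100))
```

For a prime a, the case analysis of A27 through A43 is what rules out 
every such pair; the search merely fails to find one within the bound. 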

Example 3: Grau's three axioms are sufficient to define a ternary 
boolean algebra. 

The following five axioms define a ternary boolean algebra. 
A1. f(f(x,y,u),v,f(x,y,w)) = f(x,y,f(u,v,w)) 
A2. f(y,x,x) = x 
A3. f(x,y,g(y)) = x 
B1. f(x,x,y) = x 
B2. f(g(y),y,x) = x 

The object is to show that axioms B1 and B2 both follow from axioms A1 
through A3. That this in fact could be done was announced in the 
mathematical literature [6] but no proof was presented. It was proved 
subsequently by an interactive theorem proving program [1] which 
utilized a human participant in the proof finding process. The following 
18 step proof is quite different from the one in [1] and did not 
involve any human participation. The program is presented with axioms 
A1 through A3 as well as the denial of B2. It was not necessary to 
deny B1 since B1 was produced in the course of proving B2. The proof 
took 110 minutes during which 245 new formulas were created. 

A4. ~(f(g(a),a,b) = b)    Denial of B2. 
A5. f(y,v,f(x,y,w)) = f(x,y,f(y,v,w))    Substitution of A2 into 
    left side of A1 using rule R11. 
A6. f(y,v,y) = f(x,y,f(y,v,y))    Substitution of A2 into left side of 
    A5 using rule R11. 
A7. f(y,v,f(x,y,v)) = f(x,y,v)    Substitution of A2 into right side of 
    A5 using rule R11. 
A8. f(f(v,w,u),v,f(v,w,w)) = f(u,v,w)    Substitution of A7 into right 
    side of A1 using rule R11. 
A9. f(f(v,w,u),v,w) = f(u,v,w)    Substitution of A2 into A8 using 
    rule R11. 
A10. f(x,x,y) = f(g(y),x,y)    Substitution of A3 into left side of A9 
     using rule R11. 
A11. x = f(g(g(x)),x,g(x))    Substitution of A3 into left side of A10 
     using rule R11. 
A12. g(g(x)) = x    Substitution of A3 into A11 using rule R11. 
A13. f(x,g(y),y) = x    Substitution of A12 into A3 using rule R11. 
A14. f(y,g(z),f(x,y,z)) = f(x,y,y)    Substitution of A13 into right 
     side of A5 using rule R11. 
A15. f(y,g(z),f(x,y,z)) = y    Substitution of A2 into A14 using rule R11. 
A16. f(y,g(g(y)),x) = y    Substitution of A3 into A15 using rule R11. 
A17. f(x,x,y) = x    Substitution of A12 into A16 using rule R11. This 
     proves axiom B1. 
A18. f(x,y,x) = x    Substitution of A6 into A17 using rule R11. 
A19. f(g(y),x,y) = x    Substitution of A17 into A10 using rule R11. 
A20. f(y,x,g(y)) = x    Substitution of A12 into A19 using rule R11. 
A21. f(x,y,x) = f(g(y),y,x)    Substitution of A20 into left side of A9 
     using rule R11. 
A22. f(g(y),y,x) = x    Substitution of A18 into A21. This completes the 
     proof since A22 contradicts A4. 
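
Every equation derived from A1 through A3 must hold in any model of 
those axioms, which gives a cheap consistency check on the steps of the 
proof. The sketch below assumes one particular model that the paper 
does not mention: the two-element algebra in which f is the majority 
function and g is complement. The names f, g, EQUATIONS and failing are 
illustrative. Passing the check shows only that the listed equations 
are true in this model, not that they are derivable. 

```python
from itertools import product


def f(x, y, z):
    """Majority of three bits: an assumed two-element model of the ternary operation."""
    return 1 if x + y + z >= 2 else 0


def g(y):
    """Complement of a bit: an assumed model of the unary operation."""
    return 1 - y


# Each label from the text is mapped to a predicate over the six variables.
EQUATIONS = {
    "A1": lambda x, y, z, u, v, w: f(f(x, y, u), v, f(x, y, w)) == f(x, y, f(u, v, w)),
    "A2": lambda x, y, z, u, v, w: f(y, x, x) == x,
    "A3": lambda x, y, z, u, v, w: f(x, y, g(y)) == x,
    "A5": lambda x, y, z, u, v, w: f(y, v, f(x, y, w)) == f(x, y, f(y, v, w)),
    "A9": lambda x, y, z, u, v, w: f(f(v, w, u), v, w) == f(u, v, w),
    "A10": lambda x, y, z, u, v, w: f(x, x, y) == f(g(y), x, y),
    "A12": lambda x, y, z, u, v, w: g(g(x)) == x,
    "A15": lambda x, y, z, u, v, w: f(y, g(z), f(x, y, z)) == y,
    "A17 (B1)": lambda x, y, z, u, v, w: f(x, x, y) == x,
    "A18": lambda x, y, z, u, v, w: f(x, y, x) == x,
    "A22 (B2)": lambda x, y, z, u, v, w: f(g(y), y, x) == x,
}


def failing(equations):
    """Return the labels of equations violated by some assignment of bits."""
    return [
        name
        for name, holds in equations.items()
        if not all(holds(*bits) for bits in product((0, 1), repeat=6))
    ]


if __name__ == "__main__":
    print(failing(EQUATIONS))  # an empty list means every equation holds in the model
```

A check of this kind is useful mainly for catching a mistranscribed 
step, since an equation that fails in some model of A1 through A3 
cannot be a consequence of them. 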



Example 4: In a group, if x*x*x = e and f(x,y) = x*y*x⁻¹*y⁻¹ for all 
x and y, then f(f(a,b),b) = e. 

A1. x*(y*z) = (x*y)*z 
A2. x = y ⊃ x*z = y*z 
A3. x = y ⊃ z*x = z*y 
A4. x*z = y*z ⊃ x = y 
A5. z*x = z*y ⊃ x = y 
A6. x*e = x 
A7. e*x = x 
A8. x*x⁻¹ = e 
A9. x⁻¹*x = e 
A10. x*x*x = e 
A11. f(x,y) = x*y*x⁻¹*y⁻¹ 
A12. ~(f(f(a,b),b) = e) 

Example 4 was discussed in the appendix to [13], in which it was 
shown that a paramodulation proof of this theorem could be found which 
took only 47 steps whereas a comparable proof using ordinary resolution 
took 136 steps. However, these proofs were not obtained from an actual 
procedure for finding proofs but were the product of an inspired human 
effort. By contrast, the first computer produced proof of this theorem 
was obtained by the present program. This proof was 44 steps long and 
took 30 minutes during which 415 new formulas were created. 
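
The theorem of Example 4 can be checked exhaustively in a concrete 
noncommutative group satisfying A10. The sketch below assumes one such 
group, chosen here for illustration and not taken from the paper: the 
27 upper unitriangular 3 by 3 matrices over the integers modulo 3, 
encoded as triples (a, b, c). The names mul, inv and f are hypothetical 
helpers. Because commutators are central in this group, it is an easy 
instance; it checks the statement, not the difficulty of the proof. 

```python
from itertools import product

P = 3          # modulus of the assumed example group
E = (0, 0, 0)  # identity element


def mul(x, y):
    """Multiply two unitriangular matrices encoded as (a, b, c) triples mod P."""
    a, b, c = x
    a2, b2, c2 = y
    return ((a + a2) % P, (b + b2) % P, (c + c2 + a * b2) % P)


def inv(x):
    """Inverse of an element, so that mul(x, inv(x)) == E."""
    a, b, c = x
    return ((-a) % P, (-b) % P, (a * b - c) % P)


def f(x, y):
    """The function of A11: x * y * x^-1 * y^-1."""
    return mul(mul(mul(x, y), inv(x)), inv(y))


GROUP = list(product(range(P), repeat=3))


def hypothesis_holds():
    """A10: x*x*x = e for every element of the group."""
    return all(mul(mul(x, x), x) == E for x in GROUP)


def conclusion_holds():
    """The theorem: f(f(a, b), b) = e for every a and b."""
    return all(f(f(a, b), b) == E for a in GROUP for b in GROUP)


if __name__ == "__main__":
    print(hypothesis_holds(), conclusion_holds())
```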

Example 5: Let K be a subgroup of group G and K∘g be the right coset 
of K in G for some g ∈ G. Then K∘g is identical to K if and only if 
g is a member of K. 

A1. x*(y*z) = (x*y)*z 
A2. x = y ⊃ x*z = y*z 
A3. x = y ⊃ z*x = z*y 
A4. x*z = y*z ⊃ x = y 
A5. z*x = z*y ⊃ x = y 
A6. x*e = x 
A7. e*x = x 
A8. x*x⁻¹ = e 
A9. x⁻¹*x = e 
A10. (x⁻¹)⁻¹ = x 
A11. (f(x,y) ∈ x ⊃ f(x,y) ∈ y) ⊃ x ⊆ y 
A12. x ⊆ y ⊃ (z ∈ x ⊃ z ∈ y) 
A13. (z ∈ x ∨ z ∈ y) ⊃ z ∈ x ∪ y 
A14. z ∈ x ∪ y ⊃ (z ∈ x ∨ z ∈ y) 
A15. (z ∈ x ∧ z ∈ y) ⊃ z ∈ x ∩ y 
A16. z ∈ x ∩ y ⊃ (z ∈ x ∧ z ∈ y) 
A17. (z ∈ x ∧ ~(z ∈ y)) ⊃ z ∈ x - y 
A18. z ∈ x - y ⊃ (z ∈ x ∧ ~(z ∈ y)) 
A19. (x ⊆ y ∧ y ⊆ x) ⊃ x = y 
A20. x = y ⊃ (x ⊆ y ∧ y ⊆ x) 



A21. ((g(x) ∈ x ∧ h(x) ∈ x) ⊃ g(x)*h(x) ∈ x) ⊃ s(x) 

A22. s(z) ⊃ (e ∈ z ∧ (x ∈ z ⊃ x⁻¹ ∈ z) ∧ (x⁻¹ ∈ z ⊃ x ∈ z)) 
A23. s(z) ⊃ (x*y ∈ z ∨ ~(x ∈ z) ∨ ~(y ∈ z)) 
A24. s(K) 
A25. x ∈ K∘z ⊃ (x = b(x,z)*z ∧ b(x,z) ∈ K) 
A26. x*z ∈ K∘z ∨ ~(x ∈ K) 
A27. ~((K∘g = K ⊃ g ∈ K) ∧ (g ∈ K ⊃ K∘g = K)) 

The program found a 46 step proof to example 5 in 8 minutes during 
which 219 new formulas were generated. 
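
The statement of Example 5 can be checked by brute force in a small 
group. The sketch below assumes the symmetric group on three points as 
the group G, which is a choice made here for illustration; the names 
mul, right_coset and example5_holds are hypothetical. It confirms the 
statement for the chosen subgroups only and says nothing about the 
46 step proof itself. 

```python
from itertools import permutations

# Assumed example group: all permutations of {0, 1, 2}; p[i] is the image of i.
G = list(permutations(range(3)))
E = (0, 1, 2)


def mul(p, q):
    """Compose two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))


def right_coset(K, g):
    """Return the right coset of K by g, the set of all k*g with k in K."""
    return {mul(k, g) for k in K}


def example5_holds(K, group):
    """Check that the coset of K by g equals K exactly when g is a member of K."""
    return all((right_coset(K, g) == K) == (g in K) for g in group)


if __name__ == "__main__":
    two_element_subgroup = {E, (1, 0, 2)}
    three_element_subgroup = {E, (1, 2, 0), (2, 0, 1)}
    print(example5_holds(two_element_subgroup, G))
    print(example5_holds(three_element_subgroup, G))
```

The function right_coset plays the role of axioms A25 and A26, which 
characterize membership in the coset through the product x*z. 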

Example 6: If H and K are subgroups of group G, then the product set 
HK is a subgroup of G if and only if HK is identical to KH. 

A1-A22. Same as in Example 5. 

A23. s(x) ⊃ [s(y) ⊃ (z ∈ p(x,y) ⊃ (z = m(x,y,z)*n(x,y,z) ∧ m(x,y,z) ∈ x 
     ∧ n(x,y,z) ∈ y))] 
A24. s(x) ⊃ (s(y) ⊃ (u*v ∈ p(x,y) ∨ ~(u ∈ x) ∨ ~(v ∈ y))) 
A25. s(x) ⊃ (s(y) ⊃ (z*w ∈ p(x,y) ∨ ~(z = u*r) ∨ ~(w = t*v) 
     ∨ ~(u ∈ x) ∨ ~(v ∈ y) ∨ ~(r*t ∈ p(x,y)))) 
A26. s(H) 
A27. s(K) 
A28. ~((p(H,K) = p(K,H) ⊃ s(p(H,K))) ∧ (s(p(H,K)) ⊃ p(H,K) = p(K,H))) 

The program found a 136 step proof to Example 6 in 72 minutes, during 
which 960 new formulas were created.¹⁴ 

¹⁴ Axioms A2 through A10 were not allowed to interact directly with 
each other since the remaining initial axioms were placed in a set of 
support [20]. The purpose was only to save a little computer time as 
these nine axioms were very familiar and their initial effects quite 
predictable. 
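
The statement of Example 6 can likewise be checked over every pair of 
subgroups of a small group. The sketch below again assumes the 
symmetric group on three points, an illustrative choice that is not 
from the paper, and the helper names is_subgroup, subgroups and 
example6_holds are hypothetical. The function p follows interpretation 
(12). The subgroup test relies on the group being finite, where a 
subset containing e and closed under the product is a subgroup. 

```python
from itertools import combinations, permutations

# Assumed example group: all permutations of {0, 1, 2}; q[i] is the image of i.
G = list(permutations(range(3)))
E = tuple(range(3))


def mul(a, b):
    """Compose two permutations: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))


def is_subgroup(S):
    """In a finite group, a subset containing e and closed under * is a subgroup."""
    return E in S and all(mul(a, b) in S for a in S for b in S)


def subgroups(group):
    """Enumerate every subgroup by testing all nonempty subsets of the group."""
    found = []
    for size in range(1, len(group) + 1):
        for subset in combinations(group, size):
            candidate = frozenset(subset)
            if is_subgroup(candidate):
                found.append(candidate)
    return found


def p(X, Y):
    """Product set XY: all elements x*y with x in X and y in Y."""
    return frozenset(mul(x, y) for x in X for y in Y)


def example6_holds(group):
    """Check that HK is a subgroup exactly when HK == KH, for all subgroups H, K."""
    subs = subgroups(group)
    return all(
        is_subgroup(p(H, K)) == (p(H, K) == p(K, H)) for H in subs for K in subs
    )


if __name__ == "__main__":
    print(len(subgroups(G)), example6_holds(G))
```

Two distinct two-element subgroups of this group give a product set of 
four elements that is not a subgroup, so both directions of the 
equivalence are exercised. 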

To this author's knowledge, examples 5 and 6 have never appeared 
before in the literature on automatic theorem proving. Each of these 
examples is noteworthy in that the computer was confronted with a 
very rich set of initial axioms. Both of these examples (and especially 
Example 6) should prove to be quite a challenge to machine oriented 
automatic theorem provers. 
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